neuralbench.transforms.SklearnSplit

pydantic model neuralbench.transforms.SklearnSplit

Perform a train/val/test split using sklearn’s train_test_split.

Parameters:
  • split_by (str) – Name of the column to split by (e.g., “timeline” or “subject”). If set to “_index” and the events dataframe has no “_index” column, the dataframe is reset and given a new “_index” column holding the row indices.

  • valid_split_ratio (float) – Fraction of the full dataset to use for validation.

  • test_split_ratio (float) – Fraction of the full dataset to use for testing.

  • valid_random_state (int) – Random state for validation split.

  • test_random_state (int) – Random state for test split.

  • stratify_by (str | None) – Column name to use for stratified splitting. If None, no stratification is applied.
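
The parameters above can be illustrated with a minimal sketch of a two-stage, group-level split built on sklearn’s train_test_split. This is not the SklearnSplit implementation: the helper name sketch_split, the output column “split”, and the order of the two stages are assumptions made for illustration, and stratification (stratify_by) is left out for brevity. The one point it tries to capture is that both ratios are fractions of the full dataset, so the validation ratio has to be rescaled once the test groups have been held out.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def sketch_split(
    events: pd.DataFrame,
    split_by: str = "timeline",
    valid_split_ratio: float = 0.2,
    test_split_ratio: float = 0.2,
    valid_random_state: int = 33,
    test_random_state: int = 33,
) -> pd.DataFrame:
    """Hypothetical helper: label every row as 'train', 'val', or 'test'."""
    events = events.copy()
    if split_by == "_index" and "_index" not in events.columns:
        # Assumed reading of the documented behaviour: reset the index and
        # expose the row indices as a regular column named "_index".
        events = events.reset_index(drop=True)
        events["_index"] = events.index

    # Split unique group values, so all rows of a group share one split.
    groups = sorted(events[split_by].unique())

    # Stage 1: hold out the test groups.
    rest, test = train_test_split(
        groups, test_size=test_split_ratio, random_state=test_random_state
    )
    # Stage 2: take validation groups from the remainder. The ratio is
    # rescaled so it still refers to the full dataset.
    relative_valid = valid_split_ratio / (1.0 - test_split_ratio)
    train, valid = train_test_split(
        rest, test_size=relative_valid, random_state=valid_random_state
    )

    labels = {g: "train" for g in train}
    labels.update({g: "val" for g in valid})
    labels.update({g: "test" for g in test})
    events["split"] = events[split_by].map(labels)
    return events
```

Calling sketch_split(events) on a dataframe with a “timeline” column returns a copy with one extra “split” column. Splitting on unique group values rather than on rows is what keeps all events from one timeline or subject in the same partition.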

Fields:
field split_by: str = 'timeline'
field valid_split_ratio: float = 0.2
field test_split_ratio: float = 0.2
field valid_random_state: int = 33
field test_random_state: int = 33
field stratify_by: str | None = None
requirements: tp.ClassVar[tuple[str, ...]] = ()