neuralbench.transforms.SklearnSplit

pydantic model neuralbench.transforms.SklearnSplit

Perform a train/val/test split using sklearn’s train_test_split.

Parameters:

split_by (str) – Name of the column to split by (e.g., “timeline” or “subject”). If set to “_index” and the events dataframe has no “_index” column, the dataframe’s index is reset and the row indices are stored in a new column named “_index”.
valid_split_ratio (float) – Fraction of the full dataset to use for validation.
test_split_ratio (float) – Fraction of the full dataset to use for testing.
valid_random_state (int) – Random state for validation split.
test_random_state (int) – Random state for test split.
stratify_by (str | None) – Column name to use for stratified splitting. If None, no stratification is applied.
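
The parameters above suggest a two-stage split in which each stage has its own random state. The following is a minimal sketch of that idea built directly on pandas and sklearn’s train_test_split, not the actual neuralbench implementation. The helper name sketch_split, the output column “split”, and the rounding of group counts are all assumptions made for illustration, and stratification is omitted for brevity.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def sketch_split(events, split_by, valid_split_ratio, test_split_ratio,
                 valid_random_state, test_random_state):
    """Hypothetical helper: label each row 'train', 'val' or 'test'.

    All rows sharing a value of `split_by` land in the same split.
    """
    if split_by == "_index" and "_index" not in events.columns:
        # Mirror the documented "_index" behaviour: one group per row.
        events = events.reset_index(drop=True)
        events["_index"] = events.index

    groups = events[split_by].drop_duplicates().tolist()

    # Both ratios refer to the full dataset, so convert them to absolute
    # group counts up front (assumption: round to the nearest integer).
    # Note that train_test_split rejects a count of 0.
    n_test = int(round(test_split_ratio * len(groups)))
    n_val = int(round(valid_split_ratio * len(groups)))

    # Stage 1: carve out the test groups.
    rest, test = train_test_split(
        groups, test_size=n_test, random_state=test_random_state
    )
    # Stage 2: carve the validation groups out of the remainder.
    train, val = train_test_split(
        rest, test_size=n_val, random_state=valid_random_state
    )

    label = {g: "train" for g in train}
    label.update({g: "val" for g in val})
    label.update({g: "test" for g in test})
    return events.assign(split=events[split_by].map(label))


# Usage: 10 subjects with 3 events each, split by subject.
events = pd.DataFrame(
    {"subject": [f"s{i}" for i in range(10) for _ in range(3)]}
)
out = sketch_split(events, "subject", valid_split_ratio=0.2,
                   test_split_ratio=0.2, valid_random_state=0,
                   test_random_state=1)
print(out.groupby("split")["subject"].nunique())
```

Keeping separate random states for the two stages means the test assignment can stay fixed while the validation seed is varied. Splitting on unique values of split_by, rather than on rows, keeps all events from one subject or timeline in a single split.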

Fields: