Adding a New Downstream Task

This guide shows how to add a brain-modeling task to NeuralBench. Every task lives in its own directory under tasks/{device}/{task_name}/ and is defined by a single config.yaml that overrides the global defaults.

Task directory layout

Each task needs at minimum a config.yaml. An optional grid.yaml adds task-specific hyperparameter sweeps.

neuralbench/tasks/eeg/
    my_new_task/
        config.yaml          # required
        grid.yaml            # optional (hyperparameter grid)
        datasets/            # optional (per-dataset overrides)
            variant_a.yaml

Once the directory exists, the CLI picks it up automatically:

neuralbench eeg my_new_task --debug

Directories prefixed with _ (e.g. _my_new_task) are treated as unvalidated and skipped when running neuralbench eeg all.
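The discovery rule above can be sketched in a few lines of Python. This is an illustration only, not NeuralBench's actual implementation: the function name discover_tasks is hypothetical, and the sketch assumes that a directory counts as a task when it contains a config.yaml and its name does not start with an underscore.

```python
from pathlib import Path
import tempfile


def discover_tasks(device_dir: Path) -> list[str]:
    """Return task names: subdirectories holding a config.yaml and lacking a '_' prefix."""
    return sorted(
        p.name
        for p in device_dir.iterdir()
        if p.is_dir()
        and not p.name.startswith("_")  # '_' marks an unvalidated task
        and (p / "config.yaml").is_file()
    )


# Build a throwaway layout with one validated and one unvalidated task.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "tasks" / "eeg"
    for name in ("my_new_task", "_my_draft_task"):
        (root / name).mkdir(parents=True)
        (root / name / "config.yaml").write_text("data: {}\n")
    found = discover_tasks(root)

print(found)  # ['my_new_task']
```

Renaming _my_draft_task to my_draft_task would make it appear in the list, which mirrors how a task is promoted once it has been validated.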

Anatomy of a task config

A config.yaml overrides parts of the base config (defaults/config.yaml). The main sections are:

  1. Data — which dataset to load, how to split, and what to predict.

  2. Model output & loss — number of output neurons and loss function.

  3. Metrics — evaluation metrics.

Here is a complete example for a binary classification task.

data:
  study:
    study:
      name: Mumtaz2018Machine        # Study class in neuralfetch.studies
    transforms:
      split:
        name: SklearnSplit
        split_by: subject             # Leave-subjects-out split
        valid_split_ratio: 0.2
        test_split_ratio: 0.2
        valid_random_state: 33
        test_random_state: 33
        stratify_by: diagnosis        # Stratify on the target label

  target:
    =replace=: true                   # Replace the default target config entirely
    name: LabelEncoder
    event_types: Eeg
    event_field: diagnosis            # Column in the events DataFrame
    return_one_hot: true
    aggregation: first

  trigger_event_type: Eeg             # Event type that triggers a segment
  start: 0.0                          # Segment start (seconds relative to trigger)
  duration: 5.0                       # Segment duration (seconds)
  stride: 5.0                         # Stride for continuous windowing

model_output_size: &model_output_size 2

loss:
  name: CrossEntropyLoss
  kwargs:
    label_smoothing: 0.1

metrics: !!python/object/apply:neuralbench.defaults.metrics.get_classification_metric_configs
  - *model_output_size

Key fields explained

data.study.study.name

The name of the Study class (from NeuralFetch) that loads the dataset.

data.study.transforms.split

How to split the data. Common splitters:

  • SklearnSplit — random split, optionally stratified, by subject, index, or timeline.

  • PredefinedSplit — split on a column value (e.g. release == 'R5' for a held-out set).
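To make the idea of splitting by subject concrete, here is a minimal standard-library sketch. It is not the SklearnSplit implementation: the function split_subjects and the subject IDs are hypothetical, stratification is omitted, and a single seed is used where the config above has separate valid and test random states. The point it shows is that every segment from a given subject lands in exactly one partition.

```python
import random


def split_subjects(subjects, valid_ratio=0.2, test_ratio=0.2, seed=33):
    """Assign each unique subject to exactly one of train/valid/test."""
    unique = sorted(set(subjects))
    random.Random(seed).shuffle(unique)  # seeded, so the split is reproducible
    n_test = round(len(unique) * test_ratio)
    n_valid = round(len(unique) * valid_ratio)
    test = set(unique[:n_test])
    valid = set(unique[n_test:n_test + n_valid])
    train = set(unique[n_test + n_valid:])
    return train, valid, test


# Ten hypothetical subjects with three segments each.
segments = [f"sub-{i:02d}" for i in range(10) for _ in range(3)]
train, valid, test = split_subjects(segments)
print(len(train), len(valid), len(test))  # 6 2 2
```

Splitting at the subject level rather than the segment level avoids leaking a subject's data from the training set into the evaluation sets.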

data.target

What the model predicts. Common target extractors:

  • LabelEncoder — classification labels from an event field.

  • EventField — raw numeric value (for regression).

  • HuggingFaceText / Wav2Vec / HuggingFaceImage — embedding-based targets (for retrieval tasks).

trigger_event_type

Which event type marks the start of each trial/segment.

start / duration / stride

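Windowing parameters. start (seconds relative to the trigger) and duration (seconds) define each segment. stride enables sliding windows: a stride equal to duration gives contiguous windows, and a smaller stride gives overlapping ones. Omit stride for event-locked segments.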
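The interaction between duration and stride can be checked with a small sketch. The helper window_starts and the 20-second recording length are assumptions made for illustration; how NeuralBench handles a trailing partial window is not stated here, so this sketch simply drops it.

```python
def window_starts(start, duration, stride, recording_length):
    """Return the start time (s) of every full window that fits in the recording."""
    if duration <= 0 or stride <= 0:
        raise ValueError("duration and stride must be positive")
    starts = []
    i = 0
    # Index-based stepping avoids accumulating floating-point drift.
    while start + i * stride + duration <= recording_length:
        starts.append(start + i * stride)
        i += 1
    return starts


# stride == duration: contiguous, non-overlapping windows (as in the example config).
contiguous = window_starts(start=0.0, duration=5.0, stride=5.0, recording_length=20.0)
# stride < duration: each window overlaps the previous one by 2.5 s.
overlapping = window_starts(start=0.0, duration=5.0, stride=2.5, recording_length=20.0)

print(contiguous)   # [0.0, 5.0, 10.0, 15.0]
print(overlapping)  # [0.0, 2.5, 5.0, 7.5, 10.0, 12.5, 15.0]
```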

=replace=: true

Tells the config merger to replace the subtree entirely instead of deep-merging with the defaults.
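The difference between deep-merging and replacing can be sketched as follows. This is a hypothetical merger written for illustration, not the one NeuralBench uses, and the default values shown (EventField, MSELoss, and so on) are invented for the example.

```python
REPLACE_KEY = "=replace="


def merge(base, override):
    """Deep-merge override into base; a dict flagged with '=replace=' discards the base subtree."""
    if not isinstance(override, dict):
        return override  # scalars and lists simply win
    if override.get(REPLACE_KEY) or not isinstance(base, dict):
        base = {}  # start from scratch instead of inheriting defaults
    merged = dict(base)
    for key, value in override.items():
        if key != REPLACE_KEY:  # the marker itself never reaches the result
            merged[key] = merge(base.get(key), value)
    return merged


# Hypothetical defaults and task overrides.
defaults = {
    "target": {"name": "EventField", "event_types": "Eeg", "aggregation": "mean"},
    "loss": {"name": "MSELoss", "kwargs": {"reduction": "mean"}},
}
task = {
    "target": {"=replace=": True, "name": "LabelEncoder", "event_field": "diagnosis"},
    "loss": {"name": "CrossEntropyLoss"},
}
result = merge(defaults, task)

print(result["target"])  # {'name': 'LabelEncoder', 'event_field': 'diagnosis'}
print(result["loss"])    # {'name': 'CrossEntropyLoss', 'kwargs': {'reduction': 'mean'}}
```

In this sketch the target subtree keeps none of the default keys, while loss silently inherits kwargs from the defaults. That inheritance is the situation in which replacing a subtree outright is the safer choice.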

!!python/object/apply:...

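Calls a Python factory function at config load time. The items listed under the tag are passed to the function as positional arguments; in the example, *model_output_size is a YAML alias that resolves to the value anchored at model_output_size (2). The metric factories in neuralbench.defaults.metrics return standard metric configs for classification, regression, or retrieval.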

Adding a hyperparameter grid

Create a grid.yaml alongside the config.yaml. Keys use dot-notation and map to lists of values:

data.batch_size: [16, 32]
lightning_optimizer_config.optimizer.lr: [1.0e-3, 5.0e-4, 1.0e-4]

The grid is expanded with --grid:

neuralbench eeg my_new_task --grid
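The sketch below shows one plausible way such a grid could be expanded, assuming that --grid runs the full Cartesian product of the listed values. The helpers expand_grid and apply_overrides and the base config are hypothetical; the actual expansion logic may differ.

```python
import copy
from itertools import product


def expand_grid(grid):
    """Yield one {dotted_key: value} dict per combination of grid values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def apply_overrides(config, overrides):
    """Return a copy of config with each dotted key set to its value."""
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})  # create missing intermediate levels
        node[leaf] = value
    return result


grid = {
    "data.batch_size": [16, 32],
    "lightning_optimizer_config.optimizer.lr": [1.0e-3, 5.0e-4, 1.0e-4],
}
runs = list(expand_grid(grid))
print(len(runs))  # 6

# Apply the first combination to a hypothetical base config.
first = apply_overrides({"data": {"batch_size": 64}}, runs[0])
print(first)
```

Under this assumption the two-key grid above yields 2 x 3 = 6 runs, so each additional key multiplies the number of runs by the length of its value list.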

Adding a documentation page

Each task should have a matching .rst page under docs/neuralbench/tasks/eeg/. A template is provided at docs/neuralbench/tasks/_TEMPLATE.rst. Copy it to docs/neuralbench/tasks/eeg/{task_name}.rst and fill in the task-specific details.

Checklist

  • Create tasks/{device}/{task_name}/config.yaml

  • (Optional) Create grid.yaml with task-specific sweeps

  • Add a debug query in defaults/debug_study.yaml for fast iteration

  • Verify with neuralbench {device} {task_name} --debug

  • Add a .rst doc page under docs/neuralbench/tasks/
