Track 5 – Foundation Transfer (one EEG encoder, three tasks)

Track 5 evaluates the central foundation-model claim: does a single EEG encoder transfer across heterogeneous EEG tasks without task-specific re-training? Participants submit one encoder \(g_{\phi}\); the organisers freeze it and linear-probe it for each of the three EEG tracks (Image, BCI, Sleep), then auto-enrol the submission on those leaderboards under the foundation-model label.

Shift: cross-task, cross-device, cross-montage.
Headline metric: mean rank of the submission across the three EEG leaderboards (lower is better), \(S_{\mathrm{FM}}(a) = \frac{1}{3} \sum_{t} r_{a,t}\), where \(r_{a,t}\) is the rank of submission \(a\) on the leaderboard of EEG track \(t\).
Rule: the encoder must be a single set of weights. It may read the per-example metadata \(m\) (subject, session, track, channel layout) and may be channel-aware, but bundling separately trained sub-encoders is not allowed.

How Track 5 is scored

Submitting an encoder to Track 5 automatically produces three leaderboard entries. For each EEG track, the organisers freeze the encoder weights and fit a linear probe on top of the encoder features:

Track 1 (EEG-to-Image) – linear probe on \(g_{\phi}(X, m)\), plugged into the same Top-5 retrieval scorer as Track 1.
Track 2 (EEG-to-BCI) – linear probe on the same encoder features, scored with balanced accuracy.
Track 3 (Sleep onset) – linear probe on the same encoder features, scored with onset MAE.
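
The frozen-encoder protocol described above can be sketched in a few lines. This is a minimal illustration, not the organisers' probe: `frozen_encoder`, the toy data, and the closed-form ridge-regression probe are all assumptions made for the example, and the official probe type and hyperparameters are defined by the organisers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen encoder g_phi: a fixed random projection
# followed by tanh. These weights are never updated while fitting the probe.
W_FROZEN = rng.normal(size=(16, 8)) / 4.0


def frozen_encoder(x):
    """Map inputs of shape (n, 16) to features of shape (n, 8)."""
    return np.tanh(x @ W_FROZEN)


def _with_bias(features):
    """Append a constant column so the probe can learn an intercept."""
    return np.hstack([features, np.ones((len(features), 1))])


def fit_linear_probe(features, labels, n_classes, l2=1e-2):
    """Closed-form ridge regression onto one-hot targets; only the probe is fit."""
    X = _with_bias(features)
    Y = np.eye(n_classes)[labels]
    return np.linalg.solve(X.T @ X + l2 * np.eye(X.shape[1]), X.T @ Y)


def predict(features, probe):
    """Pick the class with the largest linear score."""
    return (_with_bias(features) @ probe).argmax(axis=1)


def balanced_accuracy(y_true, y_pred, n_classes):
    """Mean per-class recall, ignoring classes absent from y_true."""
    recalls = [
        (y_pred[y_true == c] == c).mean()
        for c in range(n_classes)
        if (y_true == c).any()
    ]
    return float(np.mean(recalls))


# Toy two-class data (invented): class 1 is shifted by 1.5 in every dimension.
labels = rng.integers(0, 2, size=400)
x = rng.normal(size=(400, 16)) + 1.5 * labels[:, None]
train, test = slice(0, 300), slice(300, 400)

weights_before = W_FROZEN.copy()
probe = fit_linear_probe(frozen_encoder(x[train]), labels[train], n_classes=2)
y_pred = predict(frozen_encoder(x[test]), probe)
score = balanced_accuracy(labels[test], y_pred, n_classes=2)

# The encoder weights are untouched; only `probe` was estimated.
assert np.array_equal(weights_before, W_FROZEN)
print(f"balanced accuracy of the linear probe: {score:.3f}")
```

The point of the sketch is the separation of roles: the encoder is called but never modified, and all task-specific adaptation lives in the probe matrix.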

The submission’s Track 5 score is the mean of its ranks on those three leaderboards. Ties are broken by jointly bootstrapping the three leaderboards (see the competition proposal for details).
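
As a worked illustration of the mean-rank score, the sketch below ranks three hypothetical submissions on three toy leaderboards and averages their ranks. The leaderboard values are invented, and the average-rank treatment of tied scores is an assumption of this example; the official tie-breaking uses the joint bootstrap described in the competition proposal.

```python
from statistics import mean

# Hypothetical leaderboard values, invented purely for illustration.
# The boolean says whether higher is better: Top-5 accuracy and balanced
# accuracy are maximised, onset MAE is minimised.
LEADERBOARDS = {
    "image": ({"A": 0.42, "B": 0.35, "C": 0.35}, True),
    "bci": ({"A": 0.61, "B": 0.70, "C": 0.55}, True),
    "sleep": ({"A": 3.2, "B": 2.9, "C": 4.1}, False),
}


def ranks(scores, higher_is_better):
    """Rank 1 is best; tied entries share the average of their positions."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    out = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of entries tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2
        for name in ordered[i : j + 1]:
            out[name] = shared
        i = j + 1
    return out


def mean_rank(leaderboards):
    """Average each submission's rank over all leaderboards (lower is better)."""
    per_track = [ranks(s, hib) for s, hib in leaderboards.values()]
    return {a: mean(r[a] for r in per_track) for a in per_track[0]}


result = mean_rank(LEADERBOARDS)
for name, value in sorted(result.items(), key=lambda kv: kv[1]):
    print(f"{name}: {value:.3f}")
```

In this toy example B has the lowest mean rank (1.5) even though A wins the image leaderboard, because B is first on the other two. On the real leaderboards, ranks are computed against every entry on each track, not only against other foundation models.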

Practically: a strong Track 5 submission needs to be competitive on all three EEG tracks at the same time, with frozen encoder weights and a single channel-handling mechanism. No track-specific fine-tuning of the encoder is allowed.

Note

The official Track 5 protocol freezes the encoder and fits a linear probe per track. NeuralBench currently runs foundation models with full fine-tuning (the encoder is trained jointly with the head), so the baselines below approximate Track 5 rather than reproduce its exact frozen-encoder protocol. Frozen-encoder linear probing is coming soon; until it lands, treat these numbers as an upper-bound proxy for the transfer signal.

Reproducing the reference baseline (REVE)

REVE is the reference Track 5 baseline: a single set of weights evaluated across Tracks 1-3. To reproduce its three leaderboard entries with NeuralBench:

# Image: Top-5 accuracy
neuralbench eeg image -m reve

# BCI: balanced accuracy (starter-kit dataset, see Track 2)
neuralbench eeg motor_imagery -m reve

# Sleep onset: recording-level MAE
neuralbench eeg sleep_onset -m reve

Each command reports a test/<metric> value: batch_top5_acc, bal_acc, and bmae, respectively. REVE’s three values are reported in Overview: EEG/EMG Foundation Challenge 2026.
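
For intuition about the image metric, here is a minimal sketch of an in-batch Top-5 retrieval accuracy. It assumes cosine similarity between predicted and target embeddings, with row i of `pred` matching row i of `target`; the function name and these choices are illustrative assumptions and may differ from how NeuralBench computes batch_top5_acc.

```python
import numpy as np


def batch_topk_accuracy(pred, target, k=5):
    """Fraction of rows whose true target is among the k most similar targets."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    sim = pred @ target.T  # (n, n) cosine similarities
    true_sim = np.diag(sim)  # similarity of each prediction to its own target
    # Rank of the true match = number of candidates that are strictly more similar.
    rank = (sim > true_sim[:, None]).sum(axis=1)
    return float((rank < k).mean())


# Toy check: when predictions equal targets, every true match is ranked first.
embeddings = np.eye(8)
print(batch_topk_accuracy(embeddings, embeddings, k=5))
```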

To compare against task-specific models on the same three tracks:

neuralbench eeg image           -m all_classic all_fm
neuralbench eeg motor_imagery   -m all_classic all_fm
neuralbench eeg sleep_onset     -m all_classic all_fm

Then use the results tutorial (or the --plot-cached workflow described in Overview: EEG/EMG Foundation Challenge 2026) to aggregate ranks across the three tracks.

Submitting your own foundation model

Track 5 submissions are evaluated by the organisers; during preparation you run the encoder locally, then package it for submission. In NeuralBench, the encoder is a foundation model registered under models/<your_model>.yaml with a downstream_model_wrapper (the same wiring used by reve.yaml, labram.yaml, bendr.yaml, biot.yaml, cbramod.yaml, and luna.yaml). NeuralBench currently fine-tunes the encoder end-to-end with the downstream head; frozen-encoder linear probing (the official Track 5 protocol) is coming soon.

See Adding a New Model for the model-registration walkthrough. The same encoder config can then be passed to all three EEG tracks via -m <your_model>.

Rules cheat-sheet (from the competition proposal)

One set of weights. No separately trained sub-encoders, no metadata-conditional routing among separately trained components.
The encoder is frozen at submission time; only the organisers’ linear probes adapt to each track.
Channel handling (e.g., padding to a canonical montage, learned channel embeddings from electrode positions) is a single shared component.
Pre-training data is unrestricted but must be disclosed in the methods report.
A Track 5 team may not also submit specialised models to Tracks 1-3 in their own name; the organisers handle those entries on the Track 5 encoder’s behalf.
Track 4 (EMG) is excluded from Track 5 – EMG and EEG differ in sampling rate, substrate, and montage.
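
One of the channel-handling options named in the rules, padding to a canonical montage, can be sketched as a single shared function applied to every recording regardless of track. The `CANONICAL` channel list and the `to_canonical` helper are hypothetical; the source does not specify a canonical montage, so treat this as one possible shape for the component rather than a required design.

```python
import numpy as np

# Hypothetical canonical montage, chosen only for illustration.
CANONICAL = ["Fz", "Cz", "Pz", "Oz", "C3", "C4"]


def to_canonical(x, channel_names, canonical=CANONICAL):
    """Reorder (n_channels, n_times) data into canonical order, zero-filling gaps.

    Returns the padded array and a boolean mask marking which canonical
    channels were present in the recording. Channels outside the canonical
    montage are dropped.
    """
    index = {name: i for i, name in enumerate(channel_names)}
    padded = np.zeros((len(canonical), x.shape[1]), dtype=x.dtype)
    present = np.zeros(len(canonical), dtype=bool)
    for row, name in enumerate(canonical):
        if name in index:
            padded[row] = x[index[name]]
            present[row] = True
    return padded, present


# A 3-channel recording with 2 time samples; "T7" is not in the canonical set.
recording = np.arange(6.0).reshape(3, 2)
padded, present = to_canonical(recording, ["C4", "Cz", "T7"])
print(padded.shape, present)
```

Returning the mask alongside the padded data lets a channel-aware encoder distinguish a missing electrode from a genuinely flat signal, while the mapping itself stays identical across the three tracks.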
