Track 1 – EEG-to-Image (cross-stimulus retrieval)

Given EEG recorded while a participant views a natural image, decode the image identity. The competition tests cross-stimulus generalisation: training images and test images do not overlap.

  • Shift: seen images -> unseen images.

  • Headline metric: Top-5 retrieval accuracy in a frozen DINOv2 embedding space (higher is better).

  • Data: THINGS-EEG1 + THINGS-EEG2 + Alljoined-1 + Alljoined-1.6M (88 subjects, research- and consumer-grade hardware). The hidden evaluation set is provided by Alljoined and uses Emotiv hardware.
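
The headline metric can be illustrated with a minimal NumPy sketch. It assumes that retrieval ranks candidate image embeddings by cosine similarity to the predicted embedding and that ties count in favour of the true image; the official scorer may differ in both respects. The function name topk_retrieval_accuracy and the random toy data are hypothetical, not part of the competition kit.

```python
import numpy as np


def topk_retrieval_accuracy(pred, target, k=5):
    """Fraction of rows where target[i] is among the k candidates closest to pred[i].

    pred, target: (n_items, dim) arrays; row i of pred should retrieve row i of target.
    """
    # Cosine similarity = dot product of L2-normalised vectors (assumed metric).
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    sims = pred @ target.T  # (n_items, n_items)

    # Rank of the true item = number of candidates scoring strictly higher.
    true_sim = np.diag(sims)[:, None]
    ranks = (sims > true_sim).sum(axis=1)
    return float((ranks < k).mean())


# Toy data standing in for DINOv2 embeddings and noisy EEG predictions.
rng = np.random.default_rng(0)
target = rng.normal(size=(200, 1536))
pred = target + 2.0 * rng.normal(size=target.shape)
print(topk_retrieval_accuracy(pred, target, k=5))
```

Chance level for this metric is k divided by the number of candidates, so scores are only comparable across runs that use the same candidate set size.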

NeuralBench mapping

The closest task in NeuralBench is Image decoding.

  • CLI: neuralbench eeg image

  • Default dataset: Gifford2022Large (THINGS-EEG2, 10 subjects, 63 channels). This is one of the four datasets the competition uses.

  • Target: frozen facebook/dinov2-giant image embeddings (1536-d), the same embedding space the competition scorer uses. The EEG model's outputs are aligned to these embeddings with a CLIP contrastive loss.

  • Headline metric key: test/batch_top5_acc (Top-5 retrieval).
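
To make the contrastive objective concrete, here is a minimal NumPy sketch of a one-directional CLIP-style loss. It rests on assumptions about the loss settings in the task config: that norm_kind: y means only the target embeddings are L2-normalised, that temperature: false means logits are not scaled, and that symmetric: false means only the EEG-to-image direction is scored. The function clip_loss_one_way is hypothetical and is not the NeuralBench ClipLoss implementation.

```python
import numpy as np


def clip_loss_one_way(pred, target):
    """Cross-entropy of each prediction against all targets in the batch.

    pred, target: (batch, dim) arrays; row i of pred is paired with row i of target.
    """
    # Assumption: only the targets are L2-normalised ("norm_kind: y").
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    logits = pred @ target.T  # (batch, batch), no temperature scaling

    # Numerically stable log-softmax over the candidate axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct candidate for row i sits on the diagonal.
    return float(-np.diag(log_probs).mean())


rng = np.random.default_rng(0)
target = rng.normal(size=(8, 1536))
pred = rng.normal(size=(8, 1536))
print(clip_loss_one_way(pred, target))
```

An uninformative model that outputs all zeros gets a loss of log(batch size), which is a handy reference point when reading training curves.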

Contents of tasks/eeg/image/config.yaml:
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

data:
  study:
    source:
      name: Gifford2022Large
    split:
      name: PredefinedSplit
      test_split_query: null
      col_name: split
      valid_split_by: timeline
      valid_split_ratio: 0.2
      valid_random_state: 33
  neuro.baseline: [0.0, 0.2]
  target:
    name: HuggingFaceImage
    model_name: facebook/dinov2-giant
    layers: 0.6667
    token_aggregation: mean
    imsize: 518
    aggregation: trigger
    infra:
      cluster: auto
      keep_in_ram: false
      timeout_min: 180
      gpus_per_node: 1
      cpus_per_task: 10
      min_samples_per_job: 64
  trigger_event_type: Image
  start: -0.2
  duration: 1.0
  summary_columns: [category, filepath]
brain_model_output_size: &brain_model_output_size 1536
trainer_config.monitor: val/batch_top5_acc
trainer_config.mode: max
loss:
  name: ClipLoss
  norm_kind: y
  temperature: false
  symmetric: false
metrics: !!python/name:neuralbench.defaults.metrics.retrieval_metrics
test_full_retrieval_metrics: !!python/name:neuralbench.defaults.metrics.test_full_retrieval_metrics
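
The windowing settings in the config (start: -0.2, duration: 1.0, neuro.baseline: [0.0, 0.2]) can be read as "cut 1 s of EEG starting 200 ms before image onset, then subtract the mean of the first 200 ms". The sketch below illustrates that reading in NumPy. It assumes the baseline interval is measured from the start of the window rather than from stimulus onset, which is an interpretation of the config and not confirmed by it; epoch_with_baseline is a hypothetical helper, not a NeuralBench function.

```python
import numpy as np


def epoch_with_baseline(signal, sfreq, onset_s,
                        start=-0.2, duration=1.0, baseline=(0.0, 0.2)):
    """Cut one window around a stimulus onset and baseline-correct it.

    signal: (n_channels, n_samples) array sampled at sfreq Hz.
    baseline: (begin, end) in seconds, assumed relative to the window start.
    """
    first = int(round((onset_s + start) * sfreq))
    n = int(round(duration * sfreq))
    if first < 0 or first + n > signal.shape[1]:
        raise ValueError("window falls outside the recording")
    window = signal[:, first:first + n].astype(float)

    # Subtract the per-channel mean of the baseline interval.
    b0, b1 = (int(round(t * sfreq)) for t in baseline)
    return window - window[:, b0:b1].mean(axis=1, keepdims=True)


rng = np.random.default_rng(0)
recording = rng.normal(loc=5.0, size=(63, 1000))  # 63 channels, 10 s at 100 Hz
epoch = epoch_with_baseline(recording, sfreq=100, onset_s=2.0)
print(epoch.shape)
```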

Reproducing the baseline

# 1. Download THINGS-EEG2
neuralbench eeg image --download

# 2. Build the preprocessing + DINOv2 target cache
neuralbench eeg image --prepare

# 3. Quick local sanity check (2 epochs, subsampled)
neuralbench eeg image --debug

# 4. Full baseline -- task-specific model (EEGNet)
neuralbench eeg image -m eegnet

Step 2 is the most expensive: it computes one DINOv2 embedding per unique stimulus and caches it under CACHE_DIR.
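
The reason caching per unique stimulus pays off is that the same image is shown many times, so the expensive embedding call only needs to run once per image. A minimal in-memory sketch of that idea follows. The names embed_unique and fake_embed are hypothetical, the file names are made up, and the real pipeline caches to disk under CACHE_DIR rather than in a dict.

```python
import numpy as np


def embed_unique(filepaths, embed_fn):
    """Return one embedding row per trial, calling embed_fn once per unique path."""
    cache = {}
    for path in filepaths:
        if path not in cache:
            cache[path] = np.asarray(embed_fn(path))
    return np.stack([cache[path] for path in filepaths]), cache


calls = []


def fake_embed(path):
    # Hypothetical stand-in for a DINOv2 forward pass on one image.
    calls.append(path)
    return np.full(4, float(len(path)))


# Four trials, but only two distinct images.
trials = ["dog_01.jpg", "cat_07.jpg", "dog_01.jpg", "dog_01.jpg"]
embeddings, cache = embed_unique(trials, fake_embed)
print(embeddings.shape, len(calls))
```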

Where the competition data diverges

The starter kit ships four datasets for developing and evaluating your model; the competition itself scores submissions on a hidden Alljoined test set recorded with Emotiv hardware. The four training sources are Gifford2022Large (THINGS-EEG2, the default), Grootswagers2022Human (THINGS-EEG1), Xu2024Alljoined (Alljoined-1) and Xu2025Alljoined (Alljoined-1.6M). Baselines on these datasets are directionally correct but will not match the competition numbers exactly, because the hidden Emotiv test set is not public.

The three non-default sources are registered under tasks/eeg/image/datasets/ and can be selected with --dataset:

neuralbench eeg image --dataset grootswagers2022human
neuralbench eeg image --dataset xu2024alljoined
neuralbench eeg image --dataset xu2025alljoined
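
When sweeping over all three sources, it can help to generate the command lines programmatically. The sketch below only builds and prints the commands; it does not run them. The helper baseline_commands is hypothetical, and passing extra flags such as -m eegnet alongside --dataset is an assumption, since the commands above show those flags only separately.

```python
import shlex

# Dataset keys exactly as used with --dataset above.
DATASETS = ["grootswagers2022human", "xu2024alljoined", "xu2025alljoined"]


def baseline_commands(datasets=DATASETS, extra_args=()):
    """Build one argv list per dataset; extra_args are appended verbatim."""
    return [
        ["neuralbench", "eeg", "image", "--dataset", name, *extra_args]
        for name in datasets
    ]


for argv in baseline_commands():
    # Print a copy-pasteable shell line; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```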
