Track 3 – Sleep onset (cross-subject latency prediction)

Given a continuous wearable EEG recording, predict the latency (in seconds, from recording start) at which the participant transitions into stable sleep. The competition tests cross-subject generalisation: train sleepers and test sleepers are disjoint.

  • Shift: seen sleepers -> unseen sleepers.

  • Headline metric: recording-level mean absolute error in seconds (lower is better). Tolerance rates within 30 / 60 / 300 s are reported as diagnostics.

  • Data: continuous Muse wearable EEG, ~1000 training subjects, hidden evaluation set of the same order of magnitude. The reference onset is the first annotated N2 event (or equivalently the first non-Wake epoch satisfying a fixed persistence rule).
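
The headline metric and its diagnostics can be sketched in a few lines. This is a minimal illustration, not the competition's scoring code: the function name `onset_metrics` is hypothetical, and treating the tolerance as inclusive (error <= tolerance) is an assumption.

```python
from statistics import mean


def onset_metrics(true_s, pred_s, tolerances=(30.0, 60.0, 300.0)):
    """Recording-level MAE plus the fraction of recordings within each tolerance.

    true_s / pred_s hold one onset latency (in seconds) per recording.
    """
    if len(true_s) != len(pred_s) or not true_s:
        raise ValueError("need two non-empty sequences of equal length")
    abs_err = [abs(p - t) for t, p in zip(true_s, pred_s)]
    report = {"mae_s": mean(abs_err)}
    for tol in tolerances:
        # Assumption: an error exactly equal to the tolerance counts as a hit.
        report[f"within_{int(tol)}s"] = mean(1.0 if e <= tol else 0.0 for e in abs_err)
    return report


# Toy example: four recordings, absolute errors of 20, 80, 5 and 350 s.
true_onsets = [120.0, 300.0, 45.0, 600.0]
pred_onsets = [100.0, 380.0, 50.0, 250.0]
print(onset_metrics(true_onsets, pred_onsets))
```

A single badly missed recording (350 s here) dominates the MAE, which is why the tolerance rates are useful as a complementary view.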

Note

The Muse training set will be released by InteraXon for the competition. Until then, this starter kit runs on the Sleep-EDF dataset (Kemp2000Analysis): the data format and target extractor are identical, but the recordings differ in hardware, cohort, and annotations (see "Where the competition data diverges").

NeuralBench mapping

  • CLI: neuralbench eeg sleep_onset

  • Default dataset: Kemp2000Analysis (Sleep-EDF Expanded, 78 nights, 2 EEG channels, full polysomnography).

  • Target: latency from recording start to the first N2 epoch, extracted by AddSleepOnsetTargets + SleepOnsetTargetExtractor and capped at 600 s. This matches the competition’s reference definition.

  • Headline metric key: test/bmae (binned MAE in seconds).
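
The target described above (latency to the first N2 epoch, capped at 600 s) can be illustrated on a toy hypnogram. This is a sketch under stated assumptions, not the code of AddSleepOnsetTargets or SleepOnsetTargetExtractor: the function name, the 30 s epoch length, the string stage labels, and returning None when no N2 epoch exists are all hypothetical choices.

```python
def n2_onset_latency(stages, epoch_s=30.0, cap_s=600.0):
    """Latency (s) from recording start to the start of the first N2 epoch.

    stages:  one sleep-stage label per epoch, e.g. "W", "N1", "N2", "N3", "R".
    epoch_s: epoch length in seconds (assumed 30 s here).
    cap_s:   upper bound applied to the target (600 s in the config).
    Returns None if the hypnogram contains no N2 epoch.
    """
    for i, stage in enumerate(stages):
        if stage == "N2":
            return min(i * epoch_s, cap_s)
    return None


# First N2 at epoch index 4 -> 4 * 30 s = 120 s.
print(n2_onset_latency(["W", "W", "N1", "N1", "N2", "N2"]))

# First N2 at epoch index 25 -> 750 s, which the cap reduces to 600 s.
print(n2_onset_latency(["W"] * 25 + ["N2"]))
```

With a cap in place, every recording whose onset is later than 600 s shares the same target value, so errors on late sleepers are bounded by construction.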

tasks/eeg/sleep_onset/config.yaml:
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

data:
  study:
    source:
      name: Kemp2000Analysis
    annotate_sleep_onset:
      name: AddSleepOnsetTargets
      max_pre_n2_s: 1200.0
    split:
      name: SklearnSplit
      split_by: subject
      valid_split_ratio: 0.2
      test_split_ratio: 0.2
      valid_random_state: 33
      test_random_state: 33
  # Sleep-EDF (Kemp2000) recordings only expose bipolar derivations (Fpz-Cz,
  # Pz-Oz) whose MNE ch_locs are NaN, so the default ``ch_locs``-derived
  # position mapping in ``ChannelPositions`` produces all-``INVALID_VALUE``
  # positions.  Explicitly use the standard_1020 montage so the split-on-``-``
  # fallback resolves Fpz-Cz -> Fpz and Pz-Oz -> Pz to real 3D coordinates.
  channel_positions:
    layout_or_montage_name: standard_1020
  target:
    =replace=: true
    name: SleepOnsetTargetExtractor
    event_types: SleepOnsetMarker
    aggregation: trigger
    cap_s: 600.0
  trigger_event_type: SleepOnsetMarker
  start: 0.0
  duration: 5.0
  stride: 5.0
  sampler:
    name: RegressionBinSampler
    bin_edges: [0.0, 40.0, 90.0, 300.0, 600.0]
  summary_columns: [n2_onset]
brain_model_output_size: &brain_model_output_size 1
trainer_config:
  monitor: val/bmae
  mode: min
  patience: 7
  n_epochs: 40
loss:
  name: MultiLoss
  losses:
    mse:
      name: MSELoss
  weights:
    mse: 0.0001  # targets are in [0, 600] s; weight=0.0001 (0.01^2) keeps the weighted MSE <= ~36 (600^2 * 0.0001) so gradient clipping does not saturate.
metrics: !!python/object/apply:neuralbench.defaults.metrics.get_sleep_onset_metric_configs []
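
The scaling argument in the loss comment can be checked by hand. The sketch below is plain Python rather than the NeuralBench MultiLoss implementation; `weighted_mse` is a hypothetical helper, and the bound assumes predictions also stay inside [0, 600] s, which an unconstrained regression head does not guarantee.

```python
CAP_S = 600.0        # cap_s from the config
MSE_WEIGHT = 0.0001  # weights.mse from the config, i.e. 0.01 ** 2


def weighted_mse(pred_s, target_s, weight=MSE_WEIGHT):
    """Mean squared error in seconds^2, scaled by the loss weight."""
    sq_err = [(p - t) ** 2 for p, t in zip(pred_s, target_s)]
    return weight * sum(sq_err) / len(sq_err)


# Worst case inside the target range: prediction 0 s, target 600 s.
# 600^2 = 360000, and 360000 * 0.0001 is approximately 36.
print(round(weighted_mse([0.0], [CAP_S]), 6))

# A 60 s error contributes 3600 * 0.0001, approximately 0.36.
print(round(weighted_mse([100.0], [160.0]), 6))
```

Scaling the loss by 0.01^2 is equivalent to expressing the error in units of 100 s before squaring.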

Reproducing the baseline

# 1. Download Sleep-EDF
neuralbench eeg sleep_onset --download

# 2. Prepare the preprocessing cache
neuralbench eeg sleep_onset --prepare

# 3. Quick local sanity check
neuralbench eeg sleep_onset --debug

# 4. Full baseline -- task-specific model (EEGNet)
neuralbench eeg sleep_onset -m eegnet

Where the competition data diverges

Sleep-EDF and the Muse competition data differ on three axes that matter at training time:

  1. Hardware: research-grade PSG (Sleep-EDF) vs consumer-grade Muse headband (4-channel frontal EEG, with or without accelerometer, no EOG). Expect to drop or re-map channels in the dataloader.

  2. Cohort and recording context: laboratory monitored sleep vs home recordings with movement artifacts and impedance changes.

  3. Annotations: full hypnograms (Sleep-EDF) vs n2_onset events only, provided for the training set (Muse). NeuralBench already trains on SleepOnsetMarker events, so the model interface does not change.
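
Dropping or re-ordering channels, as mentioned under the hardware point, might look like the sketch below. It is not the NeuralBench dataloader: `select_channels` is a hypothetical helper, the recording is a plain list of per-channel sample lists, and the channel names are illustrative only.

```python
def select_channels(data, ch_names, wanted):
    """Return the rows of `data` for the channels in `wanted`, in that order.

    data:     list of per-channel sample lists, aligned with ch_names.
    ch_names: channel names as they appear in the recording.
    wanted:   channel names the model expects, in the expected order.
    Raises KeyError if a requested channel is absent, so a hardware
    mismatch fails loudly instead of silently shifting channels.
    """
    index = {name: i for i, name in enumerate(ch_names)}
    missing = [name for name in wanted if name not in index]
    if missing:
        raise KeyError(f"channels not found: {missing}")
    return [data[index[name]] for name in wanted]


# Illustrative PSG-style recording: two EEG derivations plus one EOG channel.
ch_names = ["Fpz-Cz", "Pz-Oz", "EOG"]
data = [[0.1, 0.2, 0.3], [1.1, 1.2, 1.3], [9.1, 9.2, 9.3]]

# Keep EEG only, since the Muse data has no EOG to rely on.
eeg_only = select_channels(data, ch_names, wanted=["Fpz-Cz", "Pz-Oz"])
print(len(eeg_only))
```

Failing on a missing channel is a deliberate choice here; padding absent channels with zeros would be an alternative, at the cost of hiding a mismatch between datasets.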

Two additional polysomnography datasets are already registered under tasks/eeg/sleep_onset/datasets/ and use the same AddSleepOnsetTargets + bmae pipeline as the default. They are useful for stress-testing cross-subject behaviour on more sleepers:

neuralbench eeg sleep_onset --dataset ghassemi2018you
neuralbench eeg sleep_onset --dataset alvarez2022haaglanden

Once the Muse study is registered, switching is a single data.study.source.name: Interaxon2026Muse override (or --dataset interaxon2026muse if a datasets/ YAML ships).

Submission outputs (per the competition):

  • a direct onset estimate tau_hat in seconds, or

  • per-window time-to-onset predictions, or

  • per-window sleep probabilities.

The current NeuralBench head produces the first format directly.
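
For models that emit one of the per-window formats, a direct estimate tau_hat has to be derived somehow. The competition's conversion rule is not stated here, so the sketch below shows only one plausible reduction for each format. Both function names, the 0.5 threshold, the three-window persistence requirement, and the median aggregation are assumptions; the 5 s window length mirrors `duration` and `stride` in the config.

```python
from statistics import median


def onset_from_probabilities(probs, window_s=5.0, threshold=0.5, min_consecutive=3):
    """Start time (s) of the first run of `min_consecutive` windows with p >= threshold.

    Returns None if no such run exists.
    """
    run = 0
    for i, p in enumerate(probs):
        run = run + 1 if p >= threshold else 0
        if run == min_consecutive:
            return (i - min_consecutive + 1) * window_s
    return None


def onset_from_time_to_onset(tto_preds, window_s=5.0):
    """Median of (window start + predicted time-to-onset) over all windows."""
    return median(i * window_s + t for i, t in enumerate(tto_preds))


# The isolated 0.7 at index 2 is ignored; the run starting at index 4 gives 4 * 5 s.
print(onset_from_probabilities([0.1, 0.2, 0.7, 0.3, 0.6, 0.8, 0.9, 0.95]))

# Per-window estimates are 100, 101, 99 and 101 s; their median is 100.5 s.
print(onset_from_time_to_onset([100.0, 96.0, 89.0, 86.0]))
```

The persistence requirement plays the same role as the persistence rule in the reference onset definition: it prevents a single noisy window from being reported as sleep onset.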
