Sleep stage classification

Name: sleep stage
Category: sleep
Dataset: Kemp2000 (SleepEDFx)
Objective: Multiclass classification
Split: Leave-subjects-out

Usage

neuralbench eeg sleep_stage
config.yaml:
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

data:
  study:
    source:
      name: Kemp2000Analysis
    crop_sleep_recordings:
      name: CropSleepRecordings
      max_wake_duration_min: 30.0
    split:
      name: SklearnSplit
      split_by: subject
      valid_split_ratio: 0.2
      test_split_ratio: 0.2
      valid_random_state: 33
      test_random_state: 33
  # Sleep-EDF (Kemp2000) recordings only expose bipolar derivations (Fpz-Cz,
  # Pz-Oz) whose MNE ch_locs are NaN, so the default ``ch_locs``-derived
  # position mapping in ``ChannelPositions`` produces all-``INVALID_VALUE``
  # positions.  Explicitly use the standard_1020 montage so the split-on-``-``
  # fallback resolves Fpz-Cz -> Fpz and Pz-Oz -> Pz to real 3D coordinates.
  channel_positions:
    layout_or_montage_name: standard_1020
  # Force serial MNE resample/filter for the whole task so everything is
  # in-process (no worker, no re-import, no lock); plenty fast for the
  # cached per-timeline extraction.
  neuro:
    mne_cpus: 1
  target:
    =replace=: true
    name: LabelEncoder
    allow_missing: true
    event_types: SleepStage
    event_field: stage
    return_one_hot: true
  trigger_event_type: SleepStage
  start: 0.0
  duration: 30.0
  stride: 30.0
  stride_drop_incomplete: true
  summary_columns: [stage]
compute_class_weights: true
brain_model_output_size: &brain_model_output_size 5
trainer_config.monitor: val/bal_acc
trainer_config.mode: max
loss:
  name: CrossEntropyLoss
  kwargs:
    label_smoothing: 0.1
metrics: !!python/object/apply:neuralbench.defaults.metrics.get_classification_metric_configs
  - *brain_model_output_size
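
The split block in the config above partitions the data by subject, so no subject appears in more than one of the train, validation, and test sets. The following is a minimal standard-library sketch of that idea, not the SklearnSplit implementation. The function name split_subjects and the subject IDs are hypothetical, and the sketch assumes each ratio is a fraction of all subjects, which may differ from how the library applies valid_split_ratio and test_split_ratio.

```python
import random


def split_subjects(subjects, valid_ratio=0.2, test_ratio=0.2, seed=33):
    """Partition unique subject IDs into disjoint train/valid/test sets.

    Assumption: both ratios are taken relative to the total number of
    subjects. The real SklearnSplit may interpret them differently.
    """
    unique = sorted(set(subjects))
    rng = random.Random(seed)
    rng.shuffle(unique)

    n_test = round(len(unique) * test_ratio)
    n_valid = round(len(unique) * valid_ratio)

    test = set(unique[:n_test])
    valid = set(unique[n_test:n_test + n_valid])
    train = set(unique[n_test + n_valid:])
    return train, valid, test


# Hypothetical subject IDs, two recordings (nights) per subject.
recording_subjects = [f"S{i:02d}" for i in range(10) for _ in range(2)]
train, valid, test = split_subjects(recording_subjects)

# Every recording follows its subject, so nights from one subject
# never end up in different sets.
recording_sets = [
    "train" if s in train else "valid" if s in valid else "test"
    for s in recording_subjects
]
```

Splitting on subject IDs rather than on individual recordings or windows is what keeps the evaluation a leave-subjects-out one.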

Description

The sleep stage classification task assigns each 30-second EEG window to one of five sleep stages, following established sleep scoring criteria [Rechtschaffen1968]. It uses the Sleep-EDF dataset, which contains whole-night polysomnographic recordings from healthy subjects [Kemp2000].
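
The config sets start: 0.0, duration: 30.0, stride: 30.0, and stride_drop_incomplete: true relative to each SleepStage event. One plausible reading, sketched below, is that every annotated stage interval is cut into non-overlapping 30-second windows and any trailing remainder shorter than 30 seconds is discarded. The function windows_for_event is hypothetical and only illustrates this reading; it is not the library's windowing code.

```python
def windows_for_event(event_onset_s, event_duration_s,
                      start=0.0, duration=30.0, stride=30.0):
    """Return (begin, end) times in seconds of the full windows in one event.

    Windows that would extend past the end of the event are dropped,
    mirroring the assumed meaning of stride_drop_incomplete: true.
    """
    windows = []
    i = 0
    # Compare offsets relative to the event so the check is independent
    # of where the event sits in the recording.
    while start + i * stride + duration <= event_duration_s:
        begin = event_onset_s + start + i * stride
        windows.append((begin, begin + duration))
        i += 1
    return windows


# A hypothetical 100-second N2 annotation starting at t = 600 s:
# three full windows fit, and the last 10 seconds are dropped.
n2_windows = windows_for_event(600.0, 100.0)
```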

Only the five sleep stage labels N1, N2, N3, REM, and Wake are used for this task. Other labels, such as Movement time and Unknown, are excluded.
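
A minimal sketch of the label handling described here: the five kept stages are mapped to one-hot vectors (matching return_one_hot: true and an output size of 5 in the config), and any other label is skipped. The class order, the STAGES list, and the encode_stages helper are assumptions made for illustration; the library's LabelEncoder may order classes and treat missing labels differently.

```python
# Assumed class order; the real encoder may use a different one.
STAGES = ["Wake", "N1", "N2", "N3", "REM"]
STAGE_TO_INDEX = {name: i for i, name in enumerate(STAGES)}


def encode_stages(labels):
    """One-hot encode kept stage labels and skip excluded ones."""
    encoded = []
    for label in labels:
        index = STAGE_TO_INDEX.get(label)
        if index is None:
            # e.g. "Movement time" or "Unknown": not part of the task.
            continue
        one_hot = [0.0] * len(STAGES)
        one_hot[index] = 1.0
        encoded.append(one_hot)
    return encoded


# Hypothetical sequence of per-window labels.
targets = encode_stages(["Wake", "N1", "Movement time", "N2", "Unknown", "REM"])
```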

Dataset Notes

  • Recordings are cropped to start 30 minutes before the first non-wake sleep stage and to end 30 minutes after the last one (max_wake_duration_min: 30.0 in the config). This limits the class imbalance caused by long periods of wakefulness at the beginning and end of the recordings.
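
The cropping rule can be sketched as follows, assuming that "sleep stage" here means any of N1, N2, N3, or REM and that the hypnogram is available as a list of (onset, duration, label) tuples in seconds. The function crop_bounds and the example hypnogram are hypothetical; this is not the CropSleepRecordings implementation.

```python
SLEEP_LABELS = {"N1", "N2", "N3", "REM"}  # assumed definition of "sleep"


def crop_bounds(stages, max_wake_duration_min=30.0):
    """Return (crop_start_s, crop_end_s) for one recording, or None.

    stages: list of (onset_s, duration_s, label) tuples sorted by onset.
    The crop keeps at most max_wake_duration_min of wake before the first
    sleep stage and after the last one, clipped to the recording extent.
    """
    sleep = [(onset, dur) for onset, dur, label in stages
             if label in SLEEP_LABELS]
    if not sleep:
        return None  # no sleep scored, so nothing to anchor the crop on

    margin_s = max_wake_duration_min * 60.0
    recording_start = stages[0][0]
    recording_end = stages[-1][0] + stages[-1][1]
    first_sleep_onset = sleep[0][0]
    last_sleep_end = max(onset + dur for onset, dur in sleep)

    return (max(recording_start, first_sleep_onset - margin_s),
            min(recording_end, last_sleep_end + margin_s))


# Hypothetical hypnogram: 2 h of wake, 1 h of sleep, then 2.5 h of wake.
hypnogram = [
    (0.0, 7200.0, "Wake"),
    (7200.0, 600.0, "N1"),
    (7800.0, 3000.0, "N2"),
    (10800.0, 9000.0, "Wake"),
]
bounds = crop_bounds(hypnogram)
```

In this example the crop keeps the interval from 5400 s to 12600 s, so only 30 minutes of wake remain on each side of the sleep period.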

Additional Datasets

The following additional polysomnography datasets can also be used with this task:

  • Ghassemi2018You (PhysioNet/CinC Challenge 2018) – 1,985 whole-night recordings annotated for sleep stage and arousal; only the labelled training split is used here [Ghassemi2018You].

  • Alvarez2022Haaglanden (HMC) – 151 whole-night polysomnographic recordings from clinical sleep-center patients [Alvarez2022Haaglanden].

To run with an alternate dataset:

neuralbench eeg sleep_stage --dataset alvarez2022haaglanden

References

[Rechtschaffen1968]

A Rechtschaffen, AE Kales. A manual of standardized terminology, techniques and scoring systems for sleep stages of human subjects. Los Angeles, CA: UCLA Brain Information Service. Brain Research Institute 10 (1968).

[Kemp2000]

B Kemp, AH Zwinderman, B Tuk, HAC Kamphuisen, JJL Oberyé. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE Transactions on Biomedical Engineering 47(9):1185-1194 (2000).

[Ghassemi2018You]

MM Ghassemi, et al. You Snooze, You Win: The PhysioNet/Computing in Cardiology Challenge 2018. Computing in Cardiology 45 (2018).

[Alvarez2022Haaglanden]

D Alvarez-Estevez, R Rijsman. Haaglanden Medisch Centrum Sleep Staging Database. PhysioNet (2022).