neuralfetch

One interface to fetch neuroimaging datasets from OpenNeuro, DANDI, OSF, HuggingFace, Zenodo, and more — into a unified events DataFrame ready for neuralset.

pip install neuralfetch

Installing neuralfetch automatically registers all curated studies in neuralset’s catalog — no extra imports needed.


Quickstart

Pick a sample dataset and see the full fetch-to-events workflow.

Download & load events

Tutorials

Each tutorial walks through one building block of the neuralfetch pipeline.

Fetch Your First Study
Browse the catalog, download a sample dataset, and preview the events DataFrame.
import neuralset as ns

study = ns.Study(name="Grootswagers2022HumanSample", path="./data")
study.download()
events = study.run()
print(events[["type", "start", "duration"]].head())
Study Anatomy
Timelines, event types, subject metadata, and how to compose studies with transforms.
for tl in study.iter_timelines():
    print(tl)   # {subject: "sub-01", session: "..."}

events = study.run()
print(events["type"].value_counts())
Create Your Own Study
Wrap any local or remote dataset as a Study subclass and register it in the catalog.
class MyStudy(studies.Study):
    def iter_timelines(self):
        yield {"subject": "sub-01"}

    def _load_timeline_events(self, tl):
        return pd.DataFrame([...])
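Filled out with a stand-in base class, the same two-method contract runs end to end. The `Study.run` below is a simplified sketch, not the real base class (which also handles downloading and caching):

```python
import pandas as pd

# Stand-in for studies.Study: models only the iterate-and-load contract.
class Study:
    def run(self):
        frames = [self._load_timeline_events(tl) for tl in self.iter_timelines()]
        return pd.concat(frames, ignore_index=True)

class MyStudy(Study):
    def iter_timelines(self):
        # One entry per recording; real studies yield subject x session x run.
        yield {"subject": "sub-01"}

    def _load_timeline_events(self, tl):
        # Convert raw annotations into rows of the events DataFrame.
        return pd.DataFrame([
            {"type": "Word", "start": 1.5, "duration": 0.3,
             "subject": tl["subject"], "text": "hello"},
        ])

events = MyStudy().run()
print(events[["type", "start", "duration", "subject", "text"]])
```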

How It Works

neuralfetch connects to 12 public repositories through a pluggable backend system and returns the same tidy events DataFrame regardless of the source.

DANDI · DataLad · Donders · Dryad · EEGDash · Figshare · HuggingFace · OpenNeuro · OSF · PhysioNet · Synapse · Zenodo
                          ↓
                     neuralfetch
                          ↓
              neuralset events DataFrame

The result is always the same: a tidy DataFrame where each row is an event (brain recording, word, audio, image, stimulus…) with timing, subject, and session information — regardless of where the data came from.

import neuralset as ns

study = ns.Study(name="Gwilliams2022Neural", path="/data")  # MEG + speech, from OSF
events = study.run()
events[["type", "start", "duration", "subject", "text"]].head()
type   start  duration              subject          text
Meg      0.0     396.0  Gwilliams2022Neural/A0001           NaN
Audio    0.0      42.3  Gwilliams2022Neural/A0001           NaN
Word     1.52     0.22  Gwilliams2022Neural/A0001         there
Word     1.74     0.18  Gwilliams2022Neural/A0001           was
Word     1.92     0.08  Gwilliams2022Neural/A0001             a
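Because every modality lives in one table, ordinary pandas selection works across all of them. A small sketch on a toy frame shaped like the printout above:

```python
import pandas as pd

# A toy frame shaped like the example output above.
events = pd.DataFrame({
    "type": ["Meg", "Audio", "Word", "Word", "Word"],
    "start": [0.0, 0.0, 1.52, 1.74, 1.92],
    "duration": [396.0, 42.3, 0.22, 0.18, 0.08],
    "text": [None, None, "there", "was", "a"],
})

# Select only the word events and reassemble the transcript.
words = events[events["type"] == "Word"]
print(" ".join(words["text"]))  # → there was a
```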

Supported Data Sources

neuralfetch downloads datasets from 12 public repositories through a pluggable backend system. Each backend handles authentication, pagination, and format differences so your code doesn’t have to.

Backend      Repository              What's there
Dandi        DANDI Archive           NWB neurophysiology datasets (EEG, ephys, calcium imaging)
Datalad      DataLad                 Git-annex managed large datasets
Donders      Donders Repository      Donders Institute data collections
Dryad        Dryad                   Curated, open-access research data repository
Eegdash      EEGDash                 Cloud-hosted EEG datasets with REST API
Figshare     Figshare                Research data sharing platform
Huggingface  HuggingFace Hub         ML-ready datasets (neural, audio, text)
Openneuro    OpenNeuro               1,000+ BIDS datasets — EEG, MEG, fMRI, iEEG
Osf          Open Science Framework  General-purpose research data hosting
Physionet    PhysioNet               Physiological signal databases (EEG, ECG, EMG)
Synapse      Synapse                 Collaborative research platform (auth required)
Zenodo       Zenodo                  CERN-backed open-data archive

Each study declares which backend to use. You never interact with backends directly — just call study.run() and the data is downloaded, cached, and returned as events.
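The dispatch idea can be pictured as a name-to-fetcher registry. This is an illustrative sketch only, not neuralfetch's actual internals; the names `register_backend`, `fetch_osf`, and `download` are invented for the example:

```python
from typing import Callable, Dict

# Illustrative registry: a backend is just a named fetch function.
# Real backends also handle auth, pagination, and format differences.
BACKENDS: Dict[str, Callable[[str, str], str]] = {}

def register_backend(name: str):
    def wrap(fn):
        BACKENDS[name] = fn
        return fn
    return wrap

@register_backend("osf")
def fetch_osf(dataset_id: str, path: str) -> str:
    # A real backend would call the repository's API and download files here.
    return f"{path}/{dataset_id}"

def download(backend: str, dataset_id: str, path: str) -> str:
    # Studies declare their backend by name; dispatch is a dict lookup.
    return BACKENDS[backend](dataset_id, path)

print(download("osf", "Gwilliams2022Neural", "/data"))  # → /data/Gwilliams2022Neural
```

Registering a new repository then means adding one decorated function, which is what makes the system pluggable.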


The Events DataFrame

Every study produces the same output: a pandas DataFrame where each row is a time-stamped event. Different event types (brain data, words, audio, stimuli) coexist in the same table, identified by the type column.

Core columns (always present):

Column    Type   Description
type      str    Event class: Meg, Eeg, Fmri, Word, Audio, Image, Stimulus, …
start     float  Onset time in seconds
duration  float  Duration in seconds
stop      float  start + duration (auto-computed)
timeline  str    Unique recording identifier
subject   str    Subject ID (BIDS format: StudyName/sub_id)
session   str    Session ID (BIDS entity, "" if unused)
task      str    Task ID (BIDS entity, "" if unused)
run       str    Run ID (BIDS entity, "" if unused)

Type-specific columns are added depending on event type:

  • Brain data (Meg, Eeg, Fmri, …): filepath, frequency

  • Text (Word, Sentence, Phoneme): text, language, modality, context

  • Audio / Video: filepath, frequency, offset

  • Categorical (Stimulus, Action, SleepStage, …): code, description, state

Example: an MEG study with speech stimuli
──────────────────────────────────────────
type      start  duration  subject           filepath               text     language
Meg         0.0     300.0  Study/s01         /data/s01_raw.fif      NaN      NaN
Audio       0.0     300.0  Study/s01         /data/audio.wav        NaN      NaN
Word        1.50      0.3  Study/s01         NaN                    hello    english
Word        2.10      0.2  Study/s01         NaN                    world    english
Sentence    1.50      0.8  Study/s01         NaN                    hello..  english
Stimulus    5.00      0.1  Study/s01         NaN                    NaN      NaN

This flat representation makes it easy to filter, group, and join events across modalities using standard pandas operations.
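For instance, counting events per type and pulling out the words that fall inside a recording's time window are both one-liners on a toy frame shaped like the table above:

```python
import pandas as pd

# A toy table shaped like the MEG-with-speech example above.
events = pd.DataFrame({
    "type": ["Meg", "Audio", "Word", "Word", "Stimulus"],
    "start": [0.0, 0.0, 1.5, 2.1, 5.0],
    "duration": [300.0, 300.0, 0.3, 0.2, 0.1],
    "text": [None, None, "hello", "world", None],
})

# Group: how many events of each type?
counts = events.groupby("type").size()

# Join across modalities: words whose onset lies inside the Meg recording.
meg = events[events["type"] == "Meg"].iloc[0]
in_window = (events["type"] == "Word") & \
    events["start"].between(meg["start"], meg["start"] + meg["duration"])
print(events.loc[in_window, "text"].tolist())  # → ['hello', 'world']
```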


Supported Datasets

neuralfetch provides interfaces to public datasets spanning six recording modalities, with new datasets added regularly:

Modality  Device  Example studies
MEG       Meg     Gwilliams2022Neural, Hebart2023ThingsMeg, Armeni2022Sherlock, …
EEG       Eeg     Gifford2022Large, Grootswagers2022Human, Hollenstein2018Zuco, …
fMRI      Fmri    Lebel2023Natural, Hebart2023ThingsBold, Allen2022Massive, …
iEEG      Ieeg    Wang2024Treebank, …
EMG       Emg     Sivakumar2024Emg2qwerty, …
fNIRS     Fnirs   Luke2021Analysis, …

Each study provides:

  • Download logic — automatically fetches from the right repository

  • Timeline iteration — yields subject × session × run combinations

  • Event parsing — converts raw annotations into the events DataFrame

  • Validated metadata — expected shapes, frequencies, event types, subject counts
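The validation point can be sketched as a small check over the events frame. The `validate_events` function and its checks are illustrative assumptions, not neuralfetch's actual validation code:

```python
import pandas as pd

# Sketch of metadata validation; the real checks (expected shapes,
# frequencies, subject counts) are declared per study.
REQUIRED = {"type", "start", "duration", "subject"}

def validate_events(events: pd.DataFrame) -> None:
    missing = REQUIRED - set(events.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    if (events["duration"] < 0).any():
        raise ValueError("negative durations")

validate_events(pd.DataFrame({
    "type": ["Word"], "start": [1.5], "duration": [0.3], "subject": ["Study/s01"],
}))
print("events frame is valid")
```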


Browse All Datasets

The full catalog can be filtered and sorted by modality, subject count, and recording hours.

Sample datasets are available for immediate use — no large download required.