neuralfetch¶
One interface to fetch neuroimaging datasets from OpenNeuro, DANDI, OSF, HuggingFace, Zenodo, and more — into a unified events DataFrame ready for neuralset.
pip install neuralfetch
Installing neuralfetch automatically registers all curated studies in neuralset’s catalog — no extra imports needed.
Quickstart¶
Pick a sample dataset and see the full fetch-to-events workflow.
Tutorials¶
Each tutorial walks through one building block of the neuralfetch pipeline.
How It Works¶
neuralfetch connects to 12 public repositories through a pluggable backend system and returns the same tidy events DataFrame regardless of the source.
Events DataFrame
Every study yields a tidy DataFrame in which each row is one event (brain recording, word, audio, image, stimulus…) with timing, subject, and session information.
import neuralset as ns
study = ns.Study(name="Gwilliams2022Neural", path="/data") # MEG + speech, from OSF
events = study.run()
events[["type", "start", "duration", "subject", "text"]].head()
| type | start | duration | subject | text |
|---|---|---|---|---|
| Meg | 0.0 | 396.0 | Gwilliams2022Neural/A0001 | NaN |
| Audio | 0.0 | 42.3 | Gwilliams2022Neural/A0001 | NaN |
| Word | 1.52 | 0.22 | Gwilliams2022Neural/A0001 | there |
| Word | 1.74 | 0.18 | Gwilliams2022Neural/A0001 | was |
| Word | 1.92 | 0.08 | Gwilliams2022Neural/A0001 | a |
Supported Data Sources¶
neuralfetch downloads datasets from 12 public repositories through a pluggable backend system. Each backend handles authentication, pagination, and format differences so your code doesn’t have to.
| Backend | Repository | What’s there |
|---|---|---|
| | | NWB neurophysiology datasets (EEG, ephys, calcium imaging) |
| | | Git-annex managed large datasets |
| | | Donders Institute data collections |
| | | Curated, open-access research data repository |
| | | Cloud-hosted EEG datasets with REST API |
| | | Research data sharing platform |
| | | ML-ready datasets (neural, audio, text) |
| | | 1 000+ BIDS datasets: EEG, MEG, fMRI, iEEG |
| | | General-purpose research data hosting |
| | | Physiological signal databases (EEG, ECG, EMG) |
| | | Collaborative research platform (auth required) |
| | | CERN-backed open-data archive |
Each study declares which backend to use. You never interact with backends
directly — just call study.run() and the data is downloaded, cached, and
returned as events.
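To make the "study declares its backend" idea concrete, here is a minimal sketch of a pluggable registry. It is not neuralfetch's actual implementation: `BACKENDS`, `register_backend`, `ToyStudy`, and the two fetch functions are hypothetical names invented for illustration, and the "download" only builds a path string.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a backend name to a download function.
BACKENDS: Dict[str, Callable[[str, str], str]] = {}


def register_backend(name: str):
    """Decorator that records a download function under `name`."""
    def decorator(func):
        BACKENDS[name] = func
        return func
    return decorator


@register_backend("osf")
def fetch_from_osf(dataset_id: str, path: str) -> str:
    # A real backend would handle authentication and pagination here.
    return f"{path}/osf/{dataset_id}"


@register_backend("zenodo")
def fetch_from_zenodo(dataset_id: str, path: str) -> str:
    return f"{path}/zenodo/{dataset_id}"


class ToyStudy:
    """Hypothetical study that declares its backend by name."""
    backend = "osf"
    dataset_id = "example-dataset"

    def __init__(self, path: str):
        self.path = path

    def download(self) -> str:
        # Callers never touch the backend; the study looks it up itself.
        return BACKENDS[self.backend](self.dataset_id, self.path)


print(ToyStudy("/data").download())  # /data/osf/example-dataset
```

With this kind of design, supporting a new repository means registering one more function; code that calls the study does not change.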
The Events DataFrame¶
Every study produces the same output: a pandas DataFrame where each row
is a time-stamped event. Different event types (brain data, words, audio,
stimuli) coexist in the same table, identified by the type column.
Core columns (always present):
| Column | Type | Description |
|---|---|---|
| type | str | Event class: |
| start | float | Onset time in seconds |
| duration | float | Duration in seconds |
| | float | |
| | str | Unique recording identifier |
| subject | str | Subject ID (BIDS format: |
| | str | Session ID (BIDS entity, |
| | str | Task ID (BIDS entity, |
| | str | Run ID (BIDS entity, |
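Because onset and duration are both expressed in seconds, time-window queries reduce to simple arithmetic on those two columns. The sketch below assumes only the `type`, `start`, `duration`, and `text` columns used in the quickstart; the rows are copied from the sample output above, and `events_in_window` is a hypothetical helper, not part of neuralfetch or neuralset.

```python
import pandas as pd

# Rows copied from the quickstart output above (subject column omitted).
events = pd.DataFrame(
    {
        "type": ["Audio", "Word", "Word", "Word"],
        "start": [0.0, 1.52, 1.74, 1.92],
        "duration": [42.3, 0.22, 0.18, 0.08],
        "text": [None, "there", "was", "a"],
    }
)


def events_in_window(events: pd.DataFrame, t0: float, t1: float) -> pd.DataFrame:
    """Return the events that overlap the half-open window [t0, t1)."""
    end = events["start"] + events["duration"]
    return events[(events["start"] < t1) & (end > t0)]


# Everything that is "on" at some point between 1.7 s and 1.9 s.
window = events_in_window(events, 1.7, 1.9)
print(window[["type", "text"]])
```

Here the audio recording and the words "there" and "was" overlap the window, while "a" starts too late to be included.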
Type-specific columns are added depending on event type:
Brain data (Meg, Eeg, Fmri, …): filepath, frequency
Text (Word, Sentence, Phoneme): text, language, modality, context
Audio / Video: filepath, frequency, offset
Categorical (Stimulus, Action, SleepStage, …): code, description, state
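One way to picture how type-specific columns coexist in a single table: when per-type tables are stacked, any column a given type does not define is filled with NaN. The sketch below uses plain pandas with made-up values; it illustrates the resulting layout and is not a description of how neuralfetch builds the table internally.

```python
import pandas as pd

# Illustrative per-type tables; all values are made up.
meg = pd.DataFrame(
    {
        "type": ["Meg"],
        "start": [0.0],
        "duration": [300.0],
        "filepath": ["/data/s01_raw.fif"],
        "frequency": [1000.0],
    }
)
words = pd.DataFrame(
    {
        "type": ["Word", "Word"],
        "start": [1.5, 2.1],
        "duration": [0.3, 0.2],
        "text": ["hello", "world"],
        "language": ["english", "english"],
    }
)

# Concatenation aligns on column names; cells a type does not define become NaN.
events = pd.concat([meg, words], ignore_index=True)

# Recover one type's view by dropping the columns that are entirely empty for it.
word_view = events[events["type"] == "Word"].dropna(axis=1, how="all")
print(list(word_view.columns))
# ['type', 'start', 'duration', 'text', 'language']
```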
Example: an MEG study with speech stimuli

| type | start | duration | subject | filepath | text | language |
|---|---|---|---|---|---|---|
| Meg | 0.0 | 300.0 | Study/s01 | /data/s01_raw.fif | NaN | NaN |
| Audio | 0.0 | 300.0 | Study/s01 | /data/audio.wav | NaN | NaN |
| Word | 1.50 | 0.3 | Study/s01 | NaN | hello | english |
| Word | 2.10 | 0.2 | Study/s01 | NaN | world | english |
| Sentence | 1.50 | 0.8 | Study/s01 | NaN | hello.. | english |
| Stimulus | 5.00 | 0.1 | Study/s01 | NaN | NaN | NaN |
This flat representation makes it easy to filter, group, and join events across modalities using standard pandas operations.
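As a rough illustration of those operations, the sketch below rebuilds the example table by hand (the values are the illustrative ones shown above, not real data) and applies one filter, one group-by, and one join using ordinary pandas calls.

```python
import pandas as pd

# Hand-written stand-in for the example events table (not real data).
events = pd.DataFrame(
    {
        "type": ["Meg", "Audio", "Word", "Word", "Sentence", "Stimulus"],
        "start": [0.0, 0.0, 1.50, 2.10, 1.50, 5.00],
        "duration": [300.0, 300.0, 0.3, 0.2, 0.8, 0.1],
        "subject": ["Study/s01"] * 6,
        "filepath": ["/data/s01_raw.fif", "/data/audio.wav", None, None, None, None],
        "text": [None, None, "hello", "world", "hello..", None],
    }
)

# Filter: keep only word events.
words = events[events["type"] == "Word"]

# Group: count events of each type per subject.
counts = events.groupby(["subject", "type"]).size()

# Join: attach each subject's MEG recording path to every word of that subject.
meg = events.loc[events["type"] == "Meg", ["subject", "filepath"]]
words_with_meg = words.drop(columns="filepath").merge(meg, on="subject", how="left")

print(words_with_meg[["text", "start", "filepath"]])
```

The join step is the usual starting point for epoching: each word row now carries the path of the recording it should be aligned to.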
Supported Datasets¶
neuralfetch provides interfaces to public datasets spanning six recording modalities, with new datasets added regularly:
| Modality | Device | Example studies |
|---|---|---|
| MEG | | Gwilliams2022Neural, Hebart2023ThingsMeg, Armeni2022Sherlock, … |
| EEG | | Gifford2022Large, Grootswagers2022Human, Hollenstein2018Zuco, … |
| fMRI | | Lebel2023Natural, Hebart2023ThingsBold, Allen2022Massive, … |
| iEEG | | Wang2024Treebank, … |
| EMG | | Sivakumar2024Emg2qwerty, … |
| fNIRS | | Luke2021Analysis, … |
Each study provides:
Download logic — automatically fetches from the right repository
Timeline iteration — yields subject × session × run combinations
Event parsing — converts raw annotations into the events DataFrame
Validated metadata — expected shapes, frequencies, event types, subject counts
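The "timeline iteration" item can be pictured as a Cartesian product over subjects, sessions, and runs. The sketch below is a simplified illustration only: `iter_timelines` and the identifier lists are hypothetical, and a real study may skip combinations that do not exist on disk.

```python
from itertools import product

# Hypothetical identifiers; a real study would discover these from its files.
subjects = ["s01", "s02"]
sessions = ["0", "1"]
runs = ["1", "2", "3"]


def iter_timelines(subjects, sessions, runs):
    """Yield one dict per subject × session × run combination."""
    for subject, session, run in product(subjects, sessions, runs):
        yield {"subject": subject, "session": session, "run": run}


timelines = list(iter_timelines(subjects, sessions, runs))
print(len(timelines))  # 12
print(timelines[0])    # {'subject': 's01', 'session': '0', 'run': '1'}
```

Each yielded combination would then be handed to the event-parsing step, which contributes that recording's rows to the events DataFrame.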
Browse All Datasets¶
Filter and sort all available datasets by modality, subjects, and hours:
Sample datasets are available for immediate use — no large download required: