Contributing: Onboarding a New Study¶
This guide walks you through adding a new study (dataset) to neuralfetch — the curated catalog of public brain datasets used by neuralset.
> **Tip**
> For general contributing guidelines (fork/PR workflow, pre-commit hooks, code quality), see the neuralset contributing guide.
Overview¶
Adding a study involves four steps:

1. Upload your data to a public repository supported by neuralfetch.
2. Create the study class following the neuralfetch format.
3. Validate and test the study locally.
4. Open a Pull Request for review.
1. Upload Data¶
neuralfetch supports downloading from many public data repositories. Choose the one that best fits your data:
| Repository | Best for | Link |
|---|---|---|
| OpenNeuro | BIDS-formatted EEG/MEG/fMRI | https://openneuro.org |
| HuggingFace Hub | Any modality, large files | https://huggingface.co |
| OSF | Multi-file datasets, any format | https://osf.io |
| Zenodo | Versioned archives, DOI minting | https://zenodo.org |
| Dryad | Tabular + small neural data | https://datadryad.org |
| DANDI | NWB-formatted neurophysiology | https://dandiarchive.org |
| PhysioNet | Physiological signals (EEG, ECG, EMG) | https://physionet.org |
Key requirements:

- Data must be publicly accessible (or at least requestable with a free account).
- Include a clear licence (CC-BY-4.0, CC0-1.0, etc.).
- Stimulus files should be included or separately downloadable.
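Before uploading, it can help to sanity-check that the licence string you plan to declare is a well-formed SPDX-style identifier. The helper below is an illustrative sketch only; the allowed set is an example and is not neuralfetch's actual licence policy:

```python
# Illustrative sketch: validate a licence string against a small allowed set.
# NOTE: this set is an example, NOT neuralfetch's actual policy.
ALLOWED_LICENCES = {"CC-BY-4.0", "CC0-1.0", "CC-BY-SA-4.0", "ODC-BY-1.0"}


def check_licence(licence: str) -> str:
    """Return the normalised licence identifier, or raise if unrecognised."""
    normalised = licence.strip().upper().replace(" ", "-")
    if normalised not in ALLOWED_LICENCES:
        raise ValueError(f"Unrecognised licence: {licence!r}")
    return normalised


print(check_licence("cc-by-4.0"))  # CC-BY-4.0
```

Normalising early keeps the `licence` ClassVar consistent across studies and makes it trivial to filter the catalog by licence later.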
2. Create the Study Class¶
Each study lives in a single Python file inside `neuralfetch-repo/neuralfetch/studies/`.
Naming Convention¶
- **Class name** = PascalCase of the Google Scholar BibTeX key. Search your paper on Google Scholar, click "Cite" → "BibTeX", and use the key verbatim. E.g., `allen2022massive` → `Allen2022Massive`.
- **File name** = lowercase class name, e.g., `allen2022massive.py`.
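The convention above is mechanical, so it can be sketched in code. The helper name below is hypothetical (it is not part of neuralfetch); it just shows how a Scholar-style key of the form author + year + keyword maps to the class and file names:

```python
import re


def bibtex_key_to_names(key: str) -> tuple[str, str]:
    """Derive (class name, file name) from a BibTeX key like 'allen2022massive'."""
    m = re.fullmatch(r"([a-z]+)(\d{4})(\w+)", key.lower())
    if m is None:
        raise ValueError(f"Unexpected BibTeX key format: {key!r}")
    author, year, keyword = m.groups()
    class_name = f"{author.capitalize()}{year}{keyword.capitalize()}"
    return class_name, f"{key.lower()}.py"


print(bibtex_key_to_names("allen2022massive"))
# ('Allen2022Massive', 'allen2022massive.py')
```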
Minimal Template¶
```python
import logging
import typing as tp

import pandas as pd

from neuralfetch import download
from neuralset.events import study

logger = logging.getLogger(__name__)


class AuthorYearKeyword(study.Study):
    """DatasetName: <modality> responses to <stimulus description>.

    Brief summary (2-4 sentences). State population if not healthy adults.

    Experimental Design:

    - <Device info>
    - N participants
    - Paradigm: <brief description>
    """

    aliases: tp.ClassVar[tuple[str, ...]] = ()
    bibtex: tp.ClassVar[str] = """
    @article{authoryearkeyword,
      title={...},
      author={...},
      journal={...},
      year={YYYY},
      doi={10.xxxx/...}
    }
    """
    url: tp.ClassVar[str] = "https://..."
    licence: tp.ClassVar[str] = "CC-BY-4.0"
    description: tp.ClassVar[str] = (
        "EEG recordings from N participants during task."
    )
    _info: tp.ClassVar[study.StudyInfo | None] = None  # populated after validation

    def _download(self) -> None:
        # Each repository has a matching download class (Openneuro, Osf,
        # Physionet, Huggingface, ...); see the API reference.
        dl = download.Openneuro(study="dsXXXXXX", dset_dir=self.path / "download")
        dl.download()

    def iter_timelines(self) -> tp.Iterator[dict[str, tp.Any]]:
        for sub_dir in sorted((self.path / "download").glob("sub-*")):
            yield {"subject": sub_dir.name}

    def _load_timeline_events(
        self, timeline: dict[str, tp.Any]
    ) -> pd.DataFrame:
        # Load events for a single subject/session.
        ...
```
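For BIDS-formatted datasets, `_load_timeline_events` often boils down to reading each subject's `events.tsv` with pandas. The sketch below shows the core parsing step in isolation; the column names (`onset`, `duration`, `trial_type`) follow the BIDS events convention, and the rename to `type` is an assumption about the expected output frame, not a guaranteed neuralfetch schema:

```python
import io

import pandas as pd


def load_bids_events(tsv_text: str, subject: str) -> pd.DataFrame:
    """Parse one BIDS events.tsv (onset/duration in seconds) into an events frame."""
    events = pd.read_csv(io.StringIO(tsv_text), sep="\t")
    # BIDS calls the condition column 'trial_type'; rename to a generic 'type'.
    events = events.rename(columns={"trial_type": "type"})
    events["subject"] = subject
    return events[["subject", "onset", "duration", "type"]]


sample = "onset\tduration\ttrial_type\n0.0\t1.5\tface\n2.0\t1.5\thouse\n"
frame = load_bids_events(sample, "sub-01")
print(frame)
```

In a real study class you would read the file from `self.path / "download" / timeline["subject"] / ...` instead of an in-memory string.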
ClassVar Order¶
Always declare class variables in this order:

`aliases` → `bibtex` → `url` → `licence` → `description` → `requirements` → `_info`
3. Validate and Test¶
Run the study locally¶
```python
import neuralset as ns

study = ns.Study(name="AuthorYearKeyword", path="./test_data")
study.download()
events = study.run()

# Check basic properties
print(f"Events: {len(events)}")
print(f"Subjects: {events['subject'].nunique()}")
print(f"Event types: {events['type'].value_counts()}")
```
Compute study info¶
Once the data is downloaded and the study runs correctly, compute the `StudyInfo` metadata:

```python
from neuralfetch.utils import compute_study_info

info = compute_study_info("AuthorYearKeyword", study.path)
print(info)
# Copy this into _info in your class
```
Run the test suite¶
```bash
cd neuralfetch-repo
pytest neuralfetch/ -x -q -k "AuthorYearKeyword"
```
The `test_study_info` test only runs for studies where `_info` is not `None`,
so populating `_info` is what enables test coverage.
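In sketch form, that gating amounts to a simple skip condition on the class attribute. This is a simplified illustration of the pattern, not the actual neuralfetch test code:

```python
# Simplified illustration of the gating: not the actual neuralfetch test code.
class DummyStudy:
    _info = None  # studies without computed StudyInfo are skipped


def has_study_info(study_cls: type) -> bool:
    """Mirror of the test's skip condition: only run when _info is populated."""
    return getattr(study_cls, "_info", None) is not None


print(has_study_info(DummyStudy))  # False until _info is filled in
```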
4. Open a Pull Request¶
1. Branch from the current `dev` branch:

   ```bash
   git checkout -b add-authoryearkeyword
   ```

2. Add your file to `neuralfetch-repo/neuralfetch/studies/`.

3. Run pre-commit hooks:

   ```bash
   pre-commit run --all-files
   ```

4. Push and open a PR. Include in the description:

   - Link to the dataset and paper.
   - Number of subjects and event types.
   - Any known issues or caveats.
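A description covering those points might look like the following. This is a template sketch for convenience, not an official requirement (the study name is the document's running example):

```markdown
## Add Allen2022Massive

- Dataset: <link to repository>
- Paper: <link / DOI>
- Subjects: N
- Event types: <list or value_counts summary>
- Known issues: <caveats, if any>
```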
Useful References¶
- Create Your Own Study tutorial: step-by-step Sphinx Gallery tutorial with runnable code.
- neuralset contributing guide: general coding standards, pre-commit hooks, fork/PR workflow.
- neuralfetch download helpers: API reference for `download.Openneuro`, `download.Osf`, and all other download backends.