NeuralSet

NeuralSet turns raw neural recordings and stimuli into PyTorch-ready datasets.

Quick install

pip install neuralset

Heavier optional dependencies (e.g. transformers for text, image, and other feature extraction) can be pre-installed with:

pip install "neuralset[all]"

See Installation for the full breakdown.


Quick start

Pick an example to see the code. Copy the setup commands into your terminal first, then run the Python snippet. The first run is slow because data is downloaded and caches are prepared; subsequent runs are fast, even when you change parameters such as the segment duration.

Setup (run once in your terminal)
Load study data, configure extractors & segment

Tutorials

Each tutorial walks through one building block of the NeuralSet pipeline.

Events
The universal data format — every recording, stimulus, and annotation is an event.
from neuralset.events import Event
evt = Event(type="Word", start=1.0,
            duration=0.3, timeline="sub-01")
print(evt)              # pydantic model
print(evt.model_dump()) # dict
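To make the "everything is an event" idea concrete, here is a minimal standard-library sketch. It is not NeuralSet's implementation: `SketchEvent` is a hypothetical stand-in for the pydantic model above. It reuses only the field names shown in the snippet (`type`, `start`, `duration`, `timeline`); the `stop` property and the `"Meg"` event type are assumptions added for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SketchEvent:
    """Hypothetical stand-in for an event record (not the neuralset class)."""
    type: str        # kind of event, e.g. "Word"
    start: float     # onset in seconds on the timeline
    duration: float  # length in seconds
    timeline: str    # identifier of the session the event belongs to

    @property
    def stop(self) -> float:
        # Assumed convenience: end time is onset plus duration.
        return self.start + self.duration


# A recording and a stimulus share one schema and one timeline.
events = [
    SketchEvent(type="Meg", start=0.0, duration=60.0, timeline="sub-01"),
    SketchEvent(type="Word", start=1.0, duration=0.3, timeline="sub-01"),
]

# Because every record has the same fields, selection is a plain filter.
words = [asdict(e) for e in events if e.type == "Word"]
print(words)
```

Keeping recordings, stimuli, and annotations in one schema is what lets later steps select them with a single query on `type`.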
Studies
Download datasets and load them as events DataFrames.
import neuralset as ns
study = ns.Study(name="Fake2025Meg",
                 path=ns.CACHE_FOLDER)
events = study.run()
print(f"{len(events)} events, "
      f"{events['subject'].nunique()} subjects")
Transforms
Filter, split, and enrich events before extraction.
import neuralset as ns
transform = ns.events.transforms.AddSentenceToWords()
events = transform(events)
print(events[events.type == "Sentence"].head())
Extractors
Turn events into tensors — brain signals, text embeddings, images, labels.
meg = ns.extractors.MegExtractor(frequency=100.0)
sample = meg(events, start=0.0, duration=1.0)
print(f"MEG shape: {sample.shape}")
Segmenter & Dataset
Create time-locked segments and iterate with a PyTorch DataLoader.
from torch.utils.data import DataLoader
segmenter = ns.dataloader.Segmenter(
    start=-0.1, duration=0.5,
    trigger_query='type=="Word"',
    extractors=dict(meg=meg),
    drop_incomplete=True)
dataset = segmenter.apply(events)
loader = DataLoader(dataset, batch_size=8,
                    collate_fn=dataset.collate_fn)
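The sketch below illustrates what "time-locked" means, using plain Python lists rather than NeuralSet. The `segment` function is hypothetical, and its handling of `drop_incomplete` (discarding windows that run past either end of the recording) is an assumption about the option's meaning, not a description of the library's behavior.

```python
def segment(signal, frequency, triggers, start, duration, drop_incomplete=True):
    """Cut one window per trigger from a 1-D signal sampled at `frequency` Hz.

    Each window covers [trigger + start, trigger + start + duration) seconds.
    Assumption: `drop_incomplete` discards windows that extend beyond the
    recording; otherwise only the in-bounds part is kept.
    """
    n_samples = round(duration * frequency)
    segments = []
    for t in triggers:
        first = round((t + start) * frequency)  # window onset, in samples
        last = first + n_samples
        if first < 0 or last > len(signal):
            if drop_incomplete:
                continue
            first, last = max(first, 0), min(last, len(signal))
        segments.append(signal[first:last])
    return segments


signal = list(range(200))       # 2 s of fake data at 100 Hz
word_onsets = [0.05, 0.5, 1.8]  # trigger times in seconds

kept = segment(signal, 100.0, word_onsets, start=-0.1, duration=0.5)
print(len(kept), len(kept[0]))  # only the middle trigger has a full window
```

With `start=-0.1`, each window begins 100 ms before its trigger, so the first trigger (at 0.05 s) would need samples from before the recording started and is dropped, as is the last one, which runs past the end.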
Chains
Compose Study + Transforms into reproducible, cacheable pipelines.
chain = ns.Chain(steps=[
    ns.Study(name="Fake2025Meg",
             path=ns.CACHE_FOLDER),
    ns.events.transforms.AddSentenceToWords(),
])
events = chain.run()
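One common way to make such a pipeline cacheable is to key the stored result on a hash of the step configuration, so an unchanged configuration is never recomputed. The sketch below shows that idea with the standard library only. `SketchChain`, its `(name, params, function)` step layout, and the in-memory cache are all hypothetical and do not describe how `ns.Chain` works internally.

```python
import hashlib
import json


class SketchChain:
    """Hypothetical pipeline that caches its output by configuration hash."""

    def __init__(self, steps):
        self.steps = steps  # list of (name, params, function) triples
        self.cache = {}     # in-memory stand-in for an on-disk cache
        self.calls = 0      # number of real computations, for demonstration

    def key(self):
        # Only names and parameters enter the key, so it is reproducible.
        config = [(name, params) for name, params, _ in self.steps]
        blob = json.dumps(config, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()

    def run(self):
        key = self.key()
        if key not in self.cache:
            self.calls += 1
            data = None
            for _, params, func in self.steps:
                data = func(data, **params)
            self.cache[key] = data
        return self.cache[key]


def load(_, n):
    return list(range(n))


def scale(data, factor):
    return [x * factor for x in data]


chain = SketchChain([("load", {"n": 3}, load), ("scale", {"factor": 2}, scale)])
first = chain.run()   # computed
second = chain.run()  # served from the cache
print(first, chain.calls)
```

Changing any parameter changes the hash, which triggers a fresh computation under a new key while leaving earlier results available.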

Citation

@misc{king2026neuralset,
  title  = {NeuralSet: A High-Performing Python Package for Neuro-AI},
  author = {King, Jean-R{\'e}mi and Bel, Corentin and Evanson, Linnea
            and Gadonneix, Julien and Houhamdi, Sophia and L{\'e}vy, Jarod
            and Raugel, Josephine and Santos Revilla, Andrea
            and Zhang, Mingfang and Bonnaire, Julie and Caucheteux, Charlotte
            and D{\'e}fossez, Alexandre and Desbordes, Th{\'e}o
            and Diego-Sim{\'o}n, Pablo and Khanna, Shubh and Millet, Juliette
            and Orhan, Pierre and Panchavati, Saarang and Ratouchniak, Antoine
            and Thual, Alexis and Brooks, Teon L. and Begany, Katelyn
            and Benchetrit, Yohann and Careil, Marl{\`e}ne and Banville, Hubert
            and d'Ascoli, St{\'e}phane and Dahan, Simon and Rapin, J{\'e}r{\'e}my},
  year   = {2026},
  url    = {https://kingjr.github.io/files/neuralset.pdf},
  note   = {Preprint; URL will be updated when the paper lands on arXiv}
}