neuralbench.cli.run_benchmark
- neuralbench.cli.run_benchmark(device: str, task: str | list[str], *, model: str | list[str] | None = None, dataset: str | list[str] | None = None, checkpoint: str | None = None, downstream_wrapper: str | list[str] | None = None, grid: bool = False, debug: bool = False, force: bool = False, retry: bool = False, prepare: bool = False, download: bool = False, plot_cached: bool = False) → list[dict[str, Any]][source]
Run one or more NeuralBench experiments from Python.
This is the programmatic equivalent of the neuralbench CLI. It assembles experiment configs from the same YAML files and returns test-metric dictionaries when running in debug mode.
- Parameters:
device (str) – Brain recording device ("eeg", "meg", "fmri", …).
task (str or list of str) – Task name(s), "all", or "all_multi_dataset".
model (str or list of str or None) – Predefined model name(s), "all", "all_classic", "all_fm", "all_baseline" (chance / dummy / classical sklearn pipelines), or None (uses the default model from config.yaml).
dataset (str or list of str or None) – Dataset variant(s) or "all". None uses the base config.
checkpoint (str or None) – Path to a model checkpoint to reload.
downstream_wrapper (str or list of str or None) – Downstream wrapper name(s) or "all".
grid (bool) – Expand the task-specific hyperparameter grid.
debug (bool) – Run locally with a reduced config (2 epochs, 5 batches).
force (bool) – Force re-running experiments.
retry (bool) – Retry failed experiments while keeping completed results.
prepare (bool) – Run a single experiment to warm the preprocessing cache.
download (bool) – Only download the dataset; do not run experiments.
plot_cached (bool) – Generate plots and tables from cached results only, without running any new experiments.
- Returns:
One result dict per experiment (an empty list when experiments are submitted asynchronously via Slurm).
- Return type:
list[dict[str, Any]]
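For example, a quick local smoke test might look like the sketch below. This is illustrative only: it assumes neuralbench is installed and that your config.yaml defines tasks for the "eeg" device; the "all" and "all_baseline" shortcuts and the debug-mode return value are as documented above.

```python
# Illustrative sketch — assumes neuralbench is installed and config.yaml
# defines tasks for the "eeg" device; adjust names to your own setup.
from neuralbench.cli import run_benchmark

# Debug mode runs locally with a reduced config (2 epochs, 5 batches)
# and returns one test-metric dict per experiment.
results = run_benchmark("eeg", "all", model="all_baseline", debug=True)
for metrics in results:
    print(metrics)
```

Without debug=True, experiments may be submitted asynchronously via Slurm, in which case the returned list is empty and results must be collected later (e.g. with plot_cached=True).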