neuralbench.main.BenchmarkAggregator

pydantic model neuralbench.main.BenchmarkAggregator[source]

Orchestrate multiple Experiment runs and visualise results.

Experiments are submitted (possibly via Slurm) with prepare(), collected with _collect_results(), and plotted and tabulated via the functions in neuralbench.plots.benchmark.
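A minimal usage sketch, assuming Experiment is importable from neuralbench.main; the Experiment constructor arguments shown are illustrative placeholders, not the actual schema:

    from neuralbench.main import BenchmarkAggregator, Experiment

    # Placeholder experiments: the keyword arguments are assumptions;
    # consult the Experiment documentation for the real fields.
    experiments = [
        Experiment(name="baseline"),
        Experiment(name="wider_model"),
    ]

    aggregator = BenchmarkAggregator(
        experiments=experiments,
        output_dir="benchmarks/latest",
        debug=False,
    )

    aggregator.prepare()        # submit all experiment runs (possibly via Slurm)
    records = aggregator.run()  # collect results, build plots/tables, return list[dict[str, Any]]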

Fields:
  • collect_max_workers (int)

  • debug (bool)

  • experiments (list['Experiment'])

  • loss_to_metric_mapping (dict[str, str])

  • max_workers (int)

  • output_dir (str)

field experiments: list['Experiment'] [Required][source]
field max_workers: int = 256[source]
field collect_max_workers: int = 32[source]
field debug: bool = False[source]
field output_dir: str [Optional][source]
field loss_to_metric_mapping: dict[str, str] = {'BCEWithLogitsLoss': 'test/f1_score_macro', 'ClipLoss': 'test/full_retrieval/top5_acc_subject-agg', 'CrossEntropyLoss': 'test/bal_acc', 'MSELoss': 'test/pearsonr'}[source]
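As with any pydantic field, loss_to_metric_mapping can be overridden at construction time to point a loss at a different headline metric; the 'MyContrastiveLoss' entry below is a hypothetical example, not a real class:

    from neuralbench.main import BenchmarkAggregator

    aggregator = BenchmarkAggregator(
        experiments=[],  # populate with Experiment instances as in the sketch above
        loss_to_metric_mapping={
            "CrossEntropyLoss": "test/bal_acc",
            "MSELoss": "test/pearsonr",
            # Hypothetical project-specific loss/metric pair.
            "MyContrastiveLoss": "test/retrieval/top1_acc",
        },
    )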
prepare() → None[source]
run(cached_only: bool = False) → list[dict[str, Any]][source]
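Because run() returns a plain list of dictionaries, the records convert directly into a DataFrame for ad-hoc inspection; the use of cached_only=True below assumes it reuses previously collected results instead of resubmitting jobs:

    import pandas as pd

    from neuralbench.main import BenchmarkAggregator

    aggregator = BenchmarkAggregator(experiments=[])  # use real Experiment objects in practice

    # Re-aggregate existing results; assumption: cached_only=True skips job submission.
    records = aggregator.run(cached_only=True)

    # Each record is a dict[str, Any], so the list converts directly to a DataFrame.
    df = pd.DataFrame(records)
    print(df.head())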