neuralset.extractors.neuro.SpikesExtractor

pydantic model neuralset.extractors.neuro.SpikesExtractor[source]

Feature extractor for spike data stored in HDF5/NWB files.

Reads spike times from HDF5 files and creates a dense binned array of shape (n_units, n_time_bins) at the specified frequency.

The preprocessing steps, if specified, are applied in the following order:

1. Spike binning at the target frequency
2. Scaling
3. Baseline correction (applied on segments)
4. Clamp (applied on segments)
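The pipeline above can be sketched in plain NumPy. This is a conceptual illustration, not the library's internal implementation; the function names and signatures here are hypothetical:

```python
import numpy as np

def bin_spikes(spike_times, unit_ids, n_units, frequency, duration):
    """Bin per-spike (time, unit) pairs into a dense (n_units, n_time_bins) array."""
    n_bins = int(round(duration * frequency))
    binned = np.zeros((n_units, n_bins))
    bin_idx = np.floor(np.asarray(spike_times) * frequency).astype(int)
    valid = (bin_idx >= 0) & (bin_idx < n_bins)
    # Accumulate spike counts per (unit, time-bin) cell.
    np.add.at(binned, (np.asarray(unit_ids)[valid], bin_idx[valid]), 1.0)
    return binned

def preprocess(binned, frequency, baseline=None, scale_factor=None, clamp=None):
    """Apply scaling, baseline correction, and clamping, in that order."""
    out = binned.astype(float)
    if scale_factor is not None:
        out = out * scale_factor                      # multiplicative scaling
    if baseline is not None:                          # (start, end) in seconds
        b0, b1 = (int(t * frequency) for t in baseline)
        out = out - out[:, b0:b1].mean(axis=1, keepdims=True)
    if clamp is not None:
        out = np.clip(out, -clamp, clamp)             # cap absolute value
    return out
```

For example, four spikes binned at 4 Hz over a 1-second segment produce a (2, 4) array; a baseline window of (0.0, 0.25) then subtracts each unit's mean over the first bin.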

Parameters:
  • frequency ("native" or float, default="native") – Target sampling frequency for spike binning. If "native", uses the frequency declared in the Spikes event.

  • offset (float, default=0.0) – Time offset (in seconds) applied to the segment window.

  • baseline (tuple of float, optional) – If provided as (start, end), defines the baseline correction window in seconds relative to the segment start.

  • scaler ({"RobustScaler", "StandardScaler"}, optional) – Scaling strategy to normalize channel data using scikit-learn scalers.

  • scale_factor (float, optional) – Multiplicative factor applied to the data after scaling but before clamping.

  • clamp (float, optional) – Maximum absolute value for clamping after preprocessing.

  • channel_order ({"unique", "original"}, default="unique") – "unique": channels are numbered based on unique names across all recordings. "original": channel indices follow per-recording order, enabling a fixed-size channel dimension across subjects.
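To make the two channel_order modes concrete, here is a small sketch. The helper and the exact numbering rule for "unique" (sorted distinct names) are assumptions for illustration, not the library's API:

```python
def channel_index_maps(recordings, channel_order="unique"):
    """Map each recording's channel names to integer indices.

    recordings: dict of recording name -> list of channel names.
    """
    if channel_order == "unique":
        # One shared index per distinct channel name across all recordings
        # (sorted order is an assumption made for this sketch).
        all_names = sorted({n for chans in recordings.values() for n in chans})
        global_idx = {name: i for i, name in enumerate(all_names)}
        return {rec: [global_idx[n] for n in chans]
                for rec, chans in recordings.items()}
    # "original": indices simply follow each recording's own channel order.
    return {rec: list(range(len(chans))) for rec, chans in recordings.items()}
```

With "unique", the same channel name gets the same index everywhere; with "original", every recording's channels are indexed 0, 1, 2, … independently.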

Fields:
field event_types: Literal['Spikes'] = 'Spikes'[source]
field frequency: Literal['native'] | float = 'native'[source]
field offset: float = 0.0[source]
field baseline: tuple[float, float] | None = None[source]
field scaler: None | Literal['RobustScaler', 'StandardScaler'] = None[source]
field scale_factor: float | None = None[source]
field clamp: float | None = None[source]
field channel_order: Literal['unique', 'original'] = 'unique'[source]
requirements: ClassVar[Any] = ('h5py',)[source]
field infra: MapInfra = MapInfra(folder=None, cluster=None, logs='{folder}/logs/{user}/%j', job_name=None, timeout_min=120, nodes=1, tasks_per_node=1, cpus_per_task=10, gpus_per_node=None, mem_gb=None, max_pickle_size_gb=None, slurm_constraint=None, slurm_partition=None, slurm_account=None, slurm_qos=None, slurm_use_srun=False, slurm_additional_parameters=None, slurm_setup=None, conda_env=None, workdir=None, permissions=511, version='1', keep_in_ram=True, max_jobs=128, min_samples_per_job=1, forbid_single_item_computation=False, mode='cached')[source]
prepare(obj: DataFrame | Sequence[Event] | Sequence[Segment]) → None[source]

Pre-compute and cache extractor data for a collection of events.

This method triggers _get_data on every matching event so that expensive computation (e.g. model inference) is done once and cached. It then calls the extractor on a single event to populate the output shape, which is needed when allow_missing=True.

Call prepare before using the extractor in a dataloader.

Parameters:
  • obj (DataFrame or sequence of Event or sequence of Segment) – The structure containing the events. When calling prepare on several objects, prefer passing a list of events or segments over a DataFrame to avoid redundant conversion overhead.
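At its core, prepare is a precompute-and-cache pass followed by a single probe call to record the output shape. A minimal sketch of that pattern, with a toy class whose names are hypothetical and which is not the neuralset implementation:

```python
class CachedExtractor:
    """Toy extractor illustrating the precompute-and-cache pattern behind prepare()."""

    def __init__(self):
        self._cache = {}
        self.output_shape = None

    def _get_data(self, event):
        # Stand-in for the expensive step (e.g. model inference); runs once per event.
        if event not in self._cache:
            self._cache[event] = [float(event)] * 3
        return self._cache[event]

    def prepare(self, events):
        # Pre-compute every event, then probe one event to record the output
        # shape, which is needed to fill gaps when allow_missing=True.
        for event in events:
            self._get_data(event)
        if events:
            self.output_shape = (len(self._get_data(events[0])),)

    def __call__(self, event):
        return self._get_data(event)
```

After prepare, every call in the dataloader is a cheap cache lookup, and the recorded output shape lets missing events be filled with correctly shaped placeholders.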