neuralset.extractors.base.BaseExtractor

pydantic model neuralset.extractors.base.BaseExtractor

Base class for extracting features from events within a Segment.

Subclasses define what data to extract (e.g. audio embeddings, EEG signals) while BaseExtractor handles event selection, temporal alignment, and multi-event aggregation.

To create a custom extractor, subclass BaseExtractor and implement:

_get_data(events) — expensive per-event computation (typically cached via exca.MapInfra).
_get_timed_arrays(events, start, duration) — return an iterable of TimedArray, one per event.

For extractors that produce a single static value per event (no time dimension), subclass BaseStatic instead and override get_static().

Parameters:

event_types (str or tuple of str) – Event type name(s) this extractor operates on (e.g. "Audio" or ("Image", "Text")). Must be set as a class-level default in every concrete subclass.
aggregation (str) –
Strategy for combining values when multiple matching events fall inside the same segment:
- "single" — at most one event per output sample (raises on collision).
- "sum" / "mean" — element-wise sum or mean.
- "first" / "middle" / "last" — pick one event.
- "cat" — concatenate along the first dimension.
- "stack" — stack along a new first dimension.
- "trigger" — use only the trigger event passed to __call__.
allow_missing (bool) – If True, return a zero tensor when no matching event is found in the segment instead of raising. Requires that prepare() has been called first so the output shape is known.
frequency (float or "native") – Output sampling rate in Hz. Use "native" to keep the original sampling rate of the input data. 0 is reserved for static extractors (BaseStatic).
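As a rough illustration of the aggregation options listed above, the sketch below applies them to plain Python lists. It is not neuralset code: the `aggregate` helper is hypothetical, each event is reduced to a flat feature vector, the values are assumed to be ordered by event start time, and the choice of index for "middle" with an even number of events is an assumption. The "trigger" option is left out because it depends on the event passed to __call__.

```python
from typing import List, Sequence

Vector = List[float]


def aggregate(values: Sequence[Vector], how: str = "single"):
    """Toy stand-in for the documented aggregation strategies (hypothetical helper)."""
    if not values:
        raise ValueError("no matching event")
    if how == "single":
        # The real extractor checks collisions per output sample;
        # this toy version simply rejects more than one event.
        if len(values) > 1:
            raise ValueError("'single' expects at most one event")
        return list(values[0])
    if how == "sum":
        # Element-wise sum across events.
        return [sum(column) for column in zip(*values)]
    if how == "mean":
        # Element-wise mean across events.
        return [sum(column) / len(values) for column in zip(*values)]
    if how == "first":
        return list(values[0])
    if how == "middle":
        # Assumption: with an even count, take the later of the two central events.
        return list(values[len(values) // 2])
    if how == "last":
        return list(values[-1])
    if how == "cat":
        # Concatenate along the existing first dimension.
        return [x for vector in values for x in vector]
    if how == "stack":
        # Stack along a new first dimension: one row per event.
        return [list(vector) for vector in values]
    raise ValueError(f"unsupported aggregation: {how!r}")


events = [[1.0, 2.0], [3.0, 4.0]]
print(aggregate(events, "sum"))    # [4.0, 6.0]
print(aggregate(events, "cat"))    # [1.0, 2.0, 3.0, 4.0]
print(aggregate(events, "stack"))  # [[1.0, 2.0], [3.0, 4.0]]
```

Note how "cat" keeps the number of dimensions and grows the first one, while "stack" adds a leading dimension whose size equals the number of matching events.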

Fields:

aggregation (Literal['single', 'sum', 'mean', 'first', 'middle', 'last', 'cat', 'stack', 'trigger'])
allow_missing (bool)
event_types (str | tuple[str, ...])
frequency (float | Literal['native'])

field event_types: str | tuple[str, ...] = ''

field aggregation: Literal['single', 'sum', 'mean', 'first', 'middle', 'last', 'cat', 'stack', 'trigger'] = 'single'

field allow_missing: bool = False

field frequency: float | Literal['native'] = 0.0

prepare(obj: DataFrame | Sequence[Event] | Sequence[Segment]) → None

Pre-compute and cache extractor data for a collection of events.

This method triggers _get_data on every matching event so that expensive computation (e.g. model inference) is done once and cached. It then calls the extractor on a single event to populate the output shape, which is needed when allow_missing=True.

Call prepare before using the extractor in a dataloader.

Parameters:

obj (DataFrame or sequence of Event or sequence of Segment) – The structure containing the events. When calling prepare on several objects, prefer passing a list of events or segments over a DataFrame to avoid redundant conversion overhead.
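The contract described for prepare (compute once, cache, then record the output shape so that allow_missing can return zeros) can be sketched with a small stand-in class. Everything below is hypothetical and independent of neuralset: `ToyExtractor`, its string "events", and the in-memory dict cache are assumptions made for illustration, whereas the real class works on Event objects and typically caches through exca.MapInfra.

```python
from typing import Dict, List, Optional, Sequence


class ToyExtractor:
    """Hypothetical stand-in illustrating the prepare / allow_missing contract."""

    def __init__(self, allow_missing: bool = False) -> None:
        self.allow_missing = allow_missing
        self._cache: Dict[str, List[float]] = {}
        self._shape: Optional[int] = None  # output length, unknown until prepare()
        self.n_computed = 0  # counts how often the expensive step actually runs

    def _get_data(self, event: str) -> List[float]:
        # Stand-in for an expensive per-event computation (e.g. model inference).
        if event not in self._cache:
            self.n_computed += 1
            self._cache[event] = [float(len(event)), float(event.count("a"))]
        return self._cache[event]

    def prepare(self, events: Sequence[str]) -> None:
        # Compute and cache every event once.
        for event in events:
            self._get_data(event)
        # Then use a single event to record the output shape.
        if events:
            self._shape = len(self._get_data(events[0]))

    def __call__(self, event: Optional[str]) -> List[float]:
        if event is None:  # models "no matching event in the segment"
            if not self.allow_missing:
                raise ValueError("no matching event")
            if self._shape is None:
                raise RuntimeError("call prepare() first so the output shape is known")
            return [0.0] * self._shape
        return self._get_data(event)


extractor = ToyExtractor(allow_missing=True)
extractor.prepare(["banana", "kiwi"])
print(extractor("banana"))   # [6.0, 3.0], served from the cache
print(extractor(None))       # [0.0, 0.0], zeros with the recorded shape
print(extractor.n_computed)  # 2
```

In this sketch, calling an unprepared extractor with allow_missing=True and no matching event raises, which mirrors the documented requirement that prepare() be called first.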

requirements: tp.ClassVar[tuple[str, ...]] = ()
