neuralset.dataloader.Segmenter
- class neuralset.dataloader.Segmenter(*, start: float = 0.0, duration: Annotated[float, Gt(gt=0)] | None, trigger_query: Annotated[str, AfterValidator(func=validate_query)] | None = None, stride: Annotated[float, Gt(gt=0)] | None = None, stride_drop_incomplete: bool = True, extractors: dict[str, BaseExtractor], padding: float | None = None, drop_incomplete: bool = False, drop_unused_events: bool = True)[source]
Build a SegmentDataset from an events DataFrame and extractors.

Parameters:

- extractors (dict of BaseExtractor) – Extractors to be computed; their outputs are returned in the Batch.data dictionary.
- start (float) – Start time (in seconds) of the segment, with respect to the trigger event (or stride). E.g. use -1.0 if you want the segment to start 1 s before the event.
- duration (optional float) – Duration (in seconds) of the segment (defaults to the event duration if trigger_query is used to extract segments based on specific events).
- trigger_query (optional Query) – DataFrame query selecting which events act as triggers; segments are time-locked to the matching events (see base.Query). At least one of trigger_query or stride must be provided.
- stride (optional float) – Stride (in seconds) used to define sliding-window segments.
- stride_drop_incomplete (bool) – If True and stride is not None, drop segments that are not fully contained within the (start, stop) block.
- padding (optional float or "auto") – Pad the segments to a specific duration, or to the maximum duration. None: no padding; an error is raised if segment durations vary. "auto": pad to max(segments.duration).
- drop_incomplete (bool) – Remove segments which do not contain events for one of the extractors.
- drop_unused_events (bool) – Remove events not used by the extractors before creating the segments.
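The sliding-window parameters above can be sketched in plain Python. This is a hypothetical helper for illustration only, not neuralset's implementation: stride spaces the windows, duration sizes them, and drop_incomplete mirrors stride_drop_incomplete.

```python
def sliding_segments(block_start: float, block_stop: float,
                     duration: float, stride: float,
                     drop_incomplete: bool = True) -> list[tuple[float, float]]:
    """Illustrative (start, stop) windows over a recording block."""
    out = []
    t = block_start
    while t < block_stop:
        stop = t + duration
        if drop_incomplete and stop > block_stop:
            break  # window extends past the block end: drop it
        out.append((t, stop))
        t += stride
    return out

# Non-overlapping 2 s windows over a 5 s block:
print(sliding_segments(0.0, 5.0, duration=2.0, stride=2.0))
# → [(0.0, 2.0), (2.0, 4.0)]
```

With drop_incomplete=False the trailing partial window (4.0, 6.0) would be kept as well, matching stride_drop_incomplete=False.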
Usage
-----

.. code-block:: python

    extractors = {"whatever": ns.extractors.Pulse()}
    segmenter = ns.Segmenter(extractors=extractors)
    dset = segmenter.apply(events)
    # one data item
    item = dset[0]
    assert item.data["whatever"].shape[0] == 1  # batch dimension is always added
    # through a dataloader:
    dataloader = torch.utils.data.DataLoader(dset, collate_fn=dset.collate_fn, batch_size=2)
    batch = next(iter(dataloader))
    print(batch.data["whatever"])
    # batch.segments holds the corresponding segments
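The padding options described in the parameter list can be illustrated on per-segment sample lists. The pad_segments helper below is a hypothetical sketch, not neuralset code: None must fail on ragged input, "auto" pads to the longest segment, and a numeric value pads to a fixed length.

```python
def pad_segments(segments: list[list[float]], padding=None) -> list[list[float]]:
    """Right-pad segments with zeros according to a `padding` policy."""
    lengths = {len(s) for s in segments}
    if padding is None:
        if len(lengths) > 1:
            raise ValueError("segment durations vary; padding is required")
        return [list(s) for s in segments]
    # "auto": pad to the longest segment; otherwise pad to the given length
    target = max(lengths) if padding == "auto" else int(padding)
    return [list(s) + [0.0] * (target - len(s)) for s in segments]

ragged = [[1.0, 2.0, 3.0], [4.0, 5.0]]
print(pad_segments(ragged, padding="auto"))
# → [[1.0, 2.0, 3.0], [4.0, 5.0, 0.0]]
```

Batching variable-length segments through a DataLoader requires equal shapes, which is why ragged segments with padding=None raise an error instead of being silently truncated.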