neuralset.dataloader.SegmentDataset

class neuralset.dataloader.SegmentDataset(extractors: Mapping[str, BaseExtractor], segments: Sequence[Segment], *, remove_incomplete_segments: bool = False, pad_duration: float | Literal['auto'] | None = None, transforms: dict[str, Callable] | None = None)[source]

Dataset defined through Segment instances and BaseExtractor instances.

Parameters:
  • extractors (dict of BaseExtractor) – extractors to compute; each result is returned under its name in the Batch.data dictionary

  • segments (list of Segment) – the list of segment instances defining the dataset

  • pad_duration (float | Literal["auto"] | None) –

    pad the segments to a common duration, or not at all.

    None: no padding; raises an error if segment durations vary. "auto": pads every segment to the maximum segment duration. A float: pads every segment to that duration.

  • remove_incomplete_segments (bool) – remove segments that lack events for any of the extractors

  • transforms (dict, optional) – Map of extractor names to transforms (callables transforming the extractor tensor). If an extractor name is not present, no transform is applied. Keys must be a subset of the extractor names.

Usage

    extractors = {"whatever": ns.extractors.Pulse()}
    ds = ns.SegmentDataset(extractors, segments)
    # one data item
    item = ds[0]
    assert item.data["whatever"].shape[0] == 1  # batch dimension is always added
    # through a dataloader:
    dataloader = torch.utils.data.DataLoader(ds, collate_fn=ds.collate_fn, batch_size=2)
    batch = next(iter(dataloader))
    print(batch.data["whatever"])
    # batch.segments holds the corresponding segments
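The pad_duration options can be summarized with a small sketch. This is not library code: `resolve_pad_duration` is a hypothetical helper that only mirrors the documented semantics (None, "auto", or an explicit duration) on plain float durations.

```python
def resolve_pad_duration(durations, pad_duration):
    """Sketch of the documented pad_duration semantics (not library code)."""
    if pad_duration is None:
        # No padding: all segments must already share one duration.
        if len(set(durations)) > 1:
            raise ValueError("segment durations vary; set pad_duration")
        return durations[0]
    if pad_duration == "auto":
        # Pad everything to the longest segment.
        return max(durations)
    # An explicit target duration.
    return float(pad_duration)
```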

as_one_batch(num_workers: int = 0) → Batch[source]

Deprecated: use load_all() instead.

build_dataloader(**kwargs: Any) → DataLoader[source]

Returns a dataloader for this dataset.

collate_fn(batches: list[Batch]) → Batch[source]

Creates a single Batch from several by stacking all attributes along a new first (batch) dimension.
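The stacking behaviour can be illustrated with plain Python containers. A minimal sketch, assuming each item is a dict mapping an extractor name to per-item data; outer lists stand in for torch.stack along a new leading dimension:

```python
def collate_sketch(items):
    # Stack each field of the per-item dicts along a new first (batch)
    # dimension; the real collate_fn does the analogous stacking on tensors.
    keys = items[0].keys()
    return {k: [item[k] for item in items] for k in keys}
```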

load_all(num_workers: int = 0) → Batch[source]

Returns a single batch containing all the dataset data, un-shuffled.
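A minimal sketch of the un-shuffled semantics, assuming any indexable dataset: items are gathered in index order, after which the library version would collate them into one Batch.

```python
def load_all_sketch(dataset):
    # Gather every item in index order (no shuffling); the real load_all
    # additionally collates the items into a single Batch.
    return [dataset[i] for i in range(len(dataset))]
```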