neuralset.extractors.base.LabelEncoder

pydantic model neuralset.extractors.base.LabelEncoder

Encode a given field from an event, e.g. to be used as a label.

Parameters:

event_types (str or tuple of str) – Type of event(s) to apply this extractor to.
event_field (str) – Field to encode from the event.
allow_missing (bool) – If True, allow missing events without raising errors.
treat_missing_as_separate_class (bool) – If True, treat missing events as a separate class, encoded as index -1 or, when return_one_hot=True, as a one-hot vector with the last index set to 1. This is only relevant if allow_missing is True. Note: when using LabelEncoder for a multilabel classification task, set this to False so that missing labels are represented by a vector of all zeros.
return_one_hot (bool) – If True, return one-hot representation of the index. Otherwise, return an int in [0, n_unique_values - 1] (or the corresponding values provided in predefined_mapping, and -1 for missing events if treat_missing_as_separate_class=True).
predefined_mapping (dict, optional) – If provided, use this mapping from label to index instead of computing it from data. Values must be >= 0. If return_one_hot=True, these indices MUST be contiguous and start from 0.
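
The encoding rules described by the parameters above can be illustrated with a small pure-Python sketch. This is not the neuralset implementation: encode_label is a hypothetical helper, None stands in for a missing event, and the extra trailing slot in the one-hot vector when missing events form their own class is an assumption inferred from "last index set to 1". The case of a missing event with return_one_hot=False and treat_missing_as_separate_class=False is not specified above, so the sketch raises an error there.

```python
def encode_label(label, mapping, return_one_hot=False,
                 treat_missing_as_separate_class=False):
    """Hypothetical helper mirroring the documented encoding rules.

    `label=None` stands in for a missing event (an assumption of this sketch).
    """
    missing = label is None
    if not return_one_hot:
        if missing:
            if treat_missing_as_separate_class:
                return -1  # documented index for the "missing" class
            # Behaviour for this case is not described in the documentation.
            raise ValueError("missing label without a separate class")
        return mapping[label]

    # Assumption: one extra trailing slot is reserved for the "missing" class.
    size = len(mapping) + (1 if treat_missing_as_separate_class else 0)
    vector = [0] * size
    if missing:
        if treat_missing_as_separate_class:
            vector[-1] = 1  # last index marks a missing event
        return vector  # all zeros when missing is not a separate class
    vector[mapping[label]] = 1
    return vector


mapping = {"cat": 0, "dog": 1, "bird": 2}
index = encode_label("dog", mapping)
one_hot = encode_label("dog", mapping, return_one_hot=True)
missing_one_hot = encode_label(
    None, mapping, return_one_hot=True, treat_missing_as_separate_class=True
)
```

In the multilabel setting mentioned above, the all-zeros vector returned for a missing label can be combined with other label vectors without adding a spurious class.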

Fields:

predefined_mapping (dict[str, int] | None)
return_one_hot (bool)
treat_missing_as_separate_class (bool)

field treat_missing_as_separate_class: bool = False

field return_one_hot: bool = False

field predefined_mapping: dict[str, int] | None = None
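
The constraints on predefined_mapping (values >= 0, and contiguous indices starting from 0 when return_one_hot=True) can be checked before constructing the encoder. The sketch below is illustrative only: check_predefined_mapping is a hypothetical helper, not part of neuralset, and it assumes that several labels may share one index, which the documentation does not state either way.

```python
def check_predefined_mapping(mapping, return_one_hot):
    """Hypothetical validator for the documented mapping constraints."""
    indices = sorted(set(mapping.values()))
    if any(i < 0 for i in indices):
        raise ValueError("mapping values must be >= 0")
    # One-hot output needs indices 0, 1, ..., k-1 with no gaps.
    if return_one_hot and indices != list(range(len(indices))):
        raise ValueError("indices must be contiguous and start from 0")


check_predefined_mapping({"rest": 0, "task": 1}, return_one_hot=True)   # passes
check_predefined_mapping({"rest": 1, "task": 5}, return_one_hot=False)  # passes
```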

prepare(obj: DataFrame | Sequence[Event] | Sequence[Segment]) → None

Pre-compute and cache extractor data for a collection of events.

This method triggers _get_data on every matching event so that expensive computation (e.g. model inference) is done once and cached. It then calls the extractor on a single event to populate the output shape, which is needed when allow_missing=True.

Call prepare before using the extractor in a dataloader.

Parameters: obj (DataFrame or sequence of Event or sequence of Segment) – The structure containing the events. When calling prepare on several objects, prefer passing a list of events or segments over a DataFrame to avoid redundant conversion overhead.
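
The prepare-then-use pattern described above can be pictured with a toy stand-in. Everything in this sketch is hypothetical: ToyExtractor, its dictionary-based events, and the n_computations counter are illustrative only and do not reflect how neuralset stores events or caches results. It only shows the idea of running the expensive step once per matching event and recording the output shape from a single event.

```python
class ToyExtractor:
    """Toy stand-in for an extractor with a prepare step (not neuralset code)."""

    def __init__(self, event_types):
        self.event_types = event_types
        self._cache = {}
        self.output_shape = None
        self.n_computations = 0  # counts how often the expensive step runs

    def _get_data(self, event):
        key = event["id"]
        if key not in self._cache:
            self.n_computations += 1
            # Placeholder for an expensive computation such as model inference.
            self._cache[key] = [float(len(event["text"]))]
        return self._cache[key]

    def prepare(self, events):
        matching = [e for e in events if e["type"] in self.event_types]
        for event in matching:
            self._get_data(event)  # fills the cache
        if matching:
            # A single event is enough to learn the output shape.
            self.output_shape = (len(self._get_data(matching[0])),)


events = [
    {"id": 1, "type": "Word", "text": "hello"},
    {"id": 2, "type": "Word", "text": "hi"},
    {"id": 3, "type": "Image", "text": ""},
]
extractor = ToyExtractor(event_types=("Word",))
extractor.prepare(events)
extractor.prepare(events)  # second call reuses the cache
```

Knowing the output shape in advance is what lets an extractor return a correctly sized placeholder for a missing event when allow_missing=True.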

get_static(event: Event) → Tensor

Return a single feature vector for the given event.

Override this method in subclasses to produce a static (non-temporal) embedding for one event. The returned tensor should have no time dimension — temporal wrapping is handled by BaseStatic automatically.

Parameters: event (Event) – The event to extract a feature from.
Returns: A tensor of shape (*feature_shape,) (no time axis).
Return type: torch.Tensor

requirements: tp.ClassVar[tuple[str, ...]] = ()
