neuralset.extractors.base.LabelEncoder
- class neuralset.extractors.base.LabelEncoder(*, event_types: str | tuple[str, ...] = 'Event', aggregation: Literal['single', 'sum', 'mean', 'first', 'middle', 'last', 'cat', 'stack', 'trigger'] = 'single', allow_missing: bool = False, frequency: float = 0.0, event_field: str, treat_missing_as_separate_class: bool = False, return_one_hot: bool = False, predefined_mapping: dict[str, int] | None = None)
Encode a given field from an event, e.g. to be used as a label.
- Parameters:
event_types (str or tuple of str) – Type of event(s) to apply this extractor to.
event_field (str) – Field to encode from the event.
allow_missing (bool) – If True, allow missing events without raising errors.
treat_missing_as_separate_class (bool) – If True, treat missing events as a separate class, encoded as index -1 or, for one-hot output, as a vector with its last index set to 1. Only relevant if allow_missing is True. Note: when using LabelEncoder for a multilabel classification task, set this to False so that missing labels are represented by a vector of all zeros.
return_one_hot (bool) – If True, return one-hot representation of the index. Otherwise, return an int in [0, n_unique_values - 1] (or the corresponding values provided in predefined_mapping, and -1 for missing events if treat_missing_as_separate_class=True).
predefined_mapping (dict, optional) – If provided, use this mapping from label to index instead of computing it from data. Values must be >= 0. If return_one_hot=True, these indices MUST be contiguous and start from 0.
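
The encoding rules described by the parameters above can be illustrated with a small standalone sketch. It does not use neuralset and is not the library's implementation: the helpers build_mapping, check_mapping and encode are hypothetical names introduced here. Three behaviours are assumptions, not taken from the documentation: sorted ordering of labels when the mapping is computed from data, the extra trailing slot in the one-hot vector when missing events form a separate class, and the error raised for a missing label in integer mode.

```python
from typing import Optional, Union


def build_mapping(labels: list[str]) -> dict[str, int]:
    # Assumption: indices follow sorted label order; the library may order differently.
    return {label: i for i, label in enumerate(sorted(set(labels)))}


def check_mapping(mapping: dict[str, int], return_one_hot: bool) -> None:
    # Mirrors the documented constraints on predefined_mapping.
    values = sorted(set(mapping.values()))
    if any(v < 0 for v in values):
        raise ValueError("mapping values must be >= 0")
    if return_one_hot and values != list(range(len(values))):
        raise ValueError("one-hot output needs contiguous indices starting at 0")


def encode(
    label: Optional[str],
    mapping: dict[str, int],
    *,
    return_one_hot: bool = False,
    treat_missing_as_separate_class: bool = False,
) -> Union[int, list[int]]:
    # label=None stands for a missing event (as permitted by allow_missing=True).
    check_mapping(mapping, return_one_hot)
    index = None if label is None else mapping[label]

    if not return_one_hot:
        if index is not None:
            return index
        if treat_missing_as_separate_class:
            return -1  # documented index for the "missing" class
        # Assumption: this case is not documented; the sketch simply refuses it.
        raise ValueError("missing label has no integer encoding in this sketch")

    n_classes = len(set(mapping.values()))
    # Assumption: the "missing" class occupies one extra slot at the end.
    size = n_classes + 1 if treat_missing_as_separate_class else n_classes
    vector = [0] * size
    if index is not None:
        vector[index] = 1
    elif treat_missing_as_separate_class:
        vector[-1] = 1
    # Otherwise the vector stays all zeros (the multilabel-friendly case).
    return vector


mapping = build_mapping(["dog", "cat", "dog"])  # {"cat": 0, "dog": 1}
```

With this mapping, encode("dog", mapping) gives 1, encode(None, mapping, return_one_hot=True) gives [0, 0], and encode(None, mapping, return_one_hot=True, treat_missing_as_separate_class=True) gives [0, 0, 1].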