neuralset.extractors.base.LabelEncoder

class neuralset.extractors.base.LabelEncoder(*, event_types: str | tuple[str, ...] = 'Event', aggregation: Literal['single', 'sum', 'mean', 'first', 'middle', 'last', 'cat', 'stack', 'trigger'] = 'single', allow_missing: bool = False, frequency: float = 0.0, event_field: str, treat_missing_as_separate_class: bool = False, return_one_hot: bool = False, predefined_mapping: dict[str, int] | None = None)

Encode a given field from an event, e.g. to be used as a label.

Parameters:
  • event_types (str or tuple of str) – Type of event(s) to apply this extractor to.

  • event_field (str) – Field to encode from the event.

  • allow_missing (bool) – If True, allow missing events without raising errors.

  • treat_missing_as_separate_class (bool) – If True, treat missing events as a separate class, encoded as index -1 or, for one-hot output, as a vector with its last index set to 1. Only relevant if allow_missing is True. Note: when using LabelEncoder for a multilabel classification task, set this to False so that missing labels are represented by a vector of all zeros.

  • return_one_hot (bool) – If True, return one-hot representation of the index. Otherwise, return an int in [0, n_unique_values - 1] (or the corresponding values provided in predefined_mapping, and -1 for missing events if treat_missing_as_separate_class=True).

  • predefined_mapping (dict, optional) – If provided, use this mapping from label to index instead of computing it from data. Values must be >= 0. If return_one_hot=True, these indices MUST be contiguous and start from 0.
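
The encoding rules described by the parameters above can be illustrated with a minimal, standalone sketch. This is not the neuralset implementation: `build_mapping` and `encode` are hypothetical helpers, `None` stands in for a missing event, and several details are assumptions (indices assigned in sorted label order, the "missing" class occupying one extra slot at the end of the one-hot vector, and an error for a missing event when an integer index is requested without treat_missing_as_separate_class).

```python
def build_mapping(labels, predefined_mapping=None):
    """Return a label -> index mapping (hypothetical helper)."""
    if predefined_mapping is not None:
        if any(v < 0 for v in predefined_mapping.values()):
            raise ValueError("predefined_mapping values must be >= 0")
        return dict(predefined_mapping)
    # Assumption: indices follow the sorted order of the unique labels.
    return {label: i for i, label in enumerate(sorted(set(labels)))}


def encode(label, mapping, *, return_one_hot=False,
           treat_missing_as_separate_class=False):
    """Encode one label; label=None stands in for a missing event."""
    missing = label is None
    if not return_one_hot:
        if missing:
            if treat_missing_as_separate_class:
                return -1
            # Assumption: the documented behaviour for this case is not
            # stated, so the sketch simply refuses to guess.
            raise ValueError("missing event cannot be encoded as an index")
        return mapping[label]

    n_classes = len(mapping)
    # One-hot output needs contiguous indices starting from 0.
    if sorted(mapping.values()) != list(range(n_classes)):
        raise ValueError("one-hot output needs contiguous indices from 0")
    # Assumption: the "missing" class takes one extra slot at the end.
    size = n_classes + (1 if treat_missing_as_separate_class else 0)
    vector = [0] * size
    if missing:
        if treat_missing_as_separate_class:
            vector[-1] = 1
        # Otherwise the vector stays all zeros (the multilabel convention).
    else:
        vector[mapping[label]] = 1
    return vector


mapping = build_mapping(["cat", "dog", "cat"])
print(encode("dog", mapping))
print(encode(None, mapping, return_one_hot=True))
print(encode(None, mapping, return_one_hot=True,
             treat_missing_as_separate_class=True))
```

In this sketch, passing a mapping such as `{"cat": 0, "dog": 2}` works for integer output but is rejected for one-hot output, mirroring the contiguity requirement stated for predefined_mapping.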