neuralset.events.transforms.text.AddSentenceToWords

pydantic model neuralset.events.transforms.text.AddSentenceToWords

Adds sentence-level information to word events based on Text rows.

This transform processes a DataFrame containing word-level (Word) and text-level (Text) events. For each sentence found in the Text rows, it:

  1. Creates a new Sentence row.

  2. Assigns sentence and sentence_char annotations to the matching Word rows: sentence indicates which sentence the word belongs to, and sentence_char gives the character position at which the word starts within that sentence.
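The two steps above can be illustrated with a minimal, library-independent sketch. It does not use neuralset: the regex sentence splitter, the list-of-dicts rows, and the helper names `split_sentences` and `annotate_words` are all assumptions made for illustration, and the actual transform may split and match differently.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def annotate_words(words, text):
    """Return (sentence_rows, word_rows) for a list of word strings."""
    sentences = split_sentences(text)
    # Step 1: one Sentence row per sentence found in the text.
    sentence_rows = [{"type": "Sentence", "text": s} for s in sentences]

    # Step 2: annotate each word with its sentence and start character.
    word_rows = []
    sent_idx, cursor = 0, 0  # current sentence and search position within it
    for word in words:
        row = {"type": "Word", "text": word, "sentence": None, "sentence_char": None}
        idx, pos = sent_idx, cursor
        # Search the current sentence first, then later ones.
        while idx < len(sentences):
            found = sentences[idx].find(word, pos)
            if found >= 0:
                row["sentence"] = sentences[idx]
                row["sentence_char"] = found
                sent_idx, cursor = idx, found + len(word)
                break
            idx, pos = idx + 1, 0
        # Words that match no sentence keep sentence=None.
        word_rows.append(row)
    return sentence_rows, word_rows


sentence_rows, word_rows = annotate_words(
    ["The", "cat", "sat", "It", "purred"], "The cat sat. It purred."
)
for row in word_rows:
    print(row["text"], "|", row["sentence"], "|", row["sentence_char"])
```

In this sketch a word that cannot be located keeps `sentence=None`, which corresponds to the "unmatched" word rows that max_unmatched_ratio refers to.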

Parameters:
  • max_unmatched_ratio (float, default=0.0) – Maximum allowed ratio of Word rows that do not match any sentence. Raises an error if this ratio is exceeded; with the default of 0.0, a single unmatched word is enough to trigger the error.

  • override_sentences (bool, default=False) – Whether to replace existing Sentence rows if they are already present.
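As a rough illustration of the max_unmatched_ratio check, the sketch below computes the share of word rows without a sentence annotation and raises once it exceeds the threshold. The helper name `check_unmatched_ratio`, the dict-based rows, and the use of `ValueError` are assumptions; the documentation only says that an error is raised, not which type.

```python
def check_unmatched_ratio(word_rows, max_unmatched_ratio=0.0):
    """Raise if too many word rows lack a sentence annotation (hypothetical helper)."""
    if not word_rows:
        return 0.0
    unmatched = sum(1 for row in word_rows if row.get("sentence") is None)
    ratio = unmatched / len(word_rows)
    # "Exceeded" is read here as a strict comparison (assumption).
    if ratio > max_unmatched_ratio:
        raise ValueError(
            f"{ratio:.0%} of words match no sentence "
            f"(allowed: {max_unmatched_ratio:.0%})"
        )
    return ratio


rows = [
    {"text": "The", "sentence": "The cat sat."},
    {"text": "cat", "sentence": "The cat sat."},
    {"text": "sat", "sentence": "The cat sat."},
    {"text": "uh", "sentence": None},  # e.g. a filler word absent from the text
]

tolerated = check_unmatched_ratio(rows, max_unmatched_ratio=0.25)  # passes

try:
    check_unmatched_ratio(rows)  # default 0.0: any unmatched word fails
    strict_failed = False
except ValueError:
    strict_failed = True
```

A non-zero threshold can be useful when the word-level events contain tokens, such as fillers or transcription artifacts, that are missing from the reference text.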

Fields:
field max_unmatched_ratio: float = 0.0
field override_sentences: bool = False
requirements: tp.ClassVar[tuple[str, ...]] = ()