neuralbench.transforms.TextPreprocessor
- pydantic model neuralbench.transforms.TextPreprocessor
Clean and filter text-related events.
The following operations are applied to the events:
- Keep only events with duration >= 0.
- Keep only neuro events, Audio events, or valid Word events (those whose text is a string).
- Clean the ‘text’ column by removing special characters and lowercasing.
- Drop empty or blank text entries.
- For Nieuwland2018, group similar sentences together to avoid leakage (each sentence has two very similar versions).
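The filtering and cleaning steps above can be illustrated with a small pandas sketch. This is not the library's implementation: the column names `type`, `duration`, and `text`, the `NEURO_TYPES` set, the helper names, and the exact definition of "special characters" are all assumptions made for illustration. The Nieuwland2018 sentence grouping is omitted because the source does not describe how similarity is determined.

```python
import re

import pandas as pd

# Hypothetical set of event types treated as "neuro" events (assumption).
NEURO_TYPES = {"Eeg", "Meg"}


def clean_text(value: str) -> str:
    """Lowercase and strip special characters (assumed: anything that is
    not a word character, whitespace, or an apostrophe)."""
    stripped = re.sub(r"[^\w\s']", "", value.lower())
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", stripped).strip()


def preprocess_events(events: pd.DataFrame) -> pd.DataFrame:
    """Apply the listed filters to a hypothetical events table."""
    # 1. Keep only events with duration >= 0.
    events = events[events["duration"] >= 0]

    # 2. Keep neuro events, Audio events, or Word events whose text is a string.
    has_str_text = events["text"].map(lambda t: isinstance(t, str)).astype(bool)
    keep = (
        events["type"].isin(NEURO_TYPES)
        | (events["type"] == "Audio")
        | ((events["type"] == "Word") & has_str_text)
    )
    events = events[keep].copy()

    # 3. Clean the text of the remaining Word events.
    word_mask = events["type"] == "Word"
    events.loc[word_mask, "text"] = events.loc[word_mask, "text"].map(clean_text)

    # 4. Drop Word events whose text is empty after cleaning.
    is_blank = events["text"].map(lambda t: isinstance(t, str) and t == "").astype(bool)
    return events[~(word_mask & is_blank)].reset_index(drop=True)
```

In this sketch a Word event with text `"Hello,"` is kept with its text normalized to `"hello"`, while a Word event whose text contains only punctuation or whitespace is dropped in step 4.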
- Fields: