neuralset.extractors.video.HuggingFaceVideo

pydantic model neuralset.extractors.video.HuggingFaceVideo

Extract video embeddings using a native HuggingFace video model.

Each video is divided into clips of clip_duration seconds, taken at the extractor's frequency. The video model processes each clip, and the resulting features are aggregated over layers and tokens according to the HuggingFace extractor options.

Parameters:
  • model_name (str, default="MCG-NJU/videomae-base") – HuggingFace video model identifier. Image models are not accepted here; use HuggingFaceImage for frame-by-frame video embeddings.

  • clip_duration (float | None, default=None) – Duration (in seconds) of video sub-clips to process. If None, defaults to one timestep (1 / frequency).

  • max_imsize (int | None, default=None) – Maximum image dimension for downsampling before processing.

  • num_frames (int) – Number of frames to pass to the video model per clip. Required; there is no default.
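
The interplay between frequency, clip_duration and num_frames can be sketched in plain Python. This is only an illustration and not the library's implementation: the helper names resolve_clip_duration and frame_times are hypothetical, and the even spacing of frames within a clip is an assumption, since the sampling scheme is not specified above. Only the fallback to 1 / frequency comes from the parameter description.

```python
from __future__ import annotations


def resolve_clip_duration(frequency: float, clip_duration: float | None = None) -> float:
    """Return the clip length in seconds.

    Mirrors the documented default: when clip_duration is None,
    the clip spans one timestep, i.e. 1 / frequency.
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency if clip_duration is None else clip_duration


def frame_times(clip_start: float, duration: float, num_frames: int) -> list[float]:
    """Timestamps of num_frames frames inside one clip.

    Assumption: frames are evenly spaced from the clip start; the
    real extractor may sample frames differently.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    step = duration / num_frames
    return [clip_start + i * step for i in range(num_frames)]


# At 2 Hz with no explicit clip_duration, each clip lasts 0.5 s.
duration = resolve_clip_duration(frequency=2.0)
times = frame_times(clip_start=0.0, duration=duration, num_frames=4)
```

Under these assumptions, a larger num_frames samples the same clip more densely, while a larger clip_duration widens the temporal context each embedding covers.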

Fields:
field event_types: Literal['Video'] = 'Video'
SUPPORTED_MODELS: ClassVar[tuple[str, ...]] = ('vjepa2', 'videomae', 'google/vivit', 'facebook/timesformer')
requirements: ClassVar[tuple[str, ...]] = ('transformers>=4.29.2', 'huggingface_hub>=0.27.0', 'transformers>=4.29.2', 'huggingface_hub>=0.27.0', 'torchvision>=0.15.2', 'julius>=0.2.7', 'moviepy>=2.1.2')
field model_name: str = 'MCG-NJU/videomae-base'
field hf_config: HuggingFaceVideoConfig = HuggingFaceVideoConfig(**{'model_cls_name': 'AutoModel', 'model_kwargs': None, 'processor_cls_name': 'AutoProcessor', 'processor_kwargs': {'do_rescale': True}})
field clip_duration: float | None = None
field max_imsize: int | None = None
field num_frames: int [Required]
field infra: MapInfra = MapInfra(folder=None, cluster=None, logs='{folder}/logs/{user}/%j', job_name=None, timeout_min=120, nodes=1, tasks_per_node=1, cpus_per_task=8, gpus_per_node=1, mem_gb=None, max_pickle_size_gb=None, slurm_constraint=None, slurm_partition=None, slurm_account=None, slurm_qos=None, slurm_use_srun=False, slurm_additional_parameters=None, slurm_setup=None, conda_env=None, workdir=None, permissions=511, version='v5', keep_in_ram=True, max_jobs=128, min_samples_per_job=128, forbid_single_item_computation=False, mode='cached')
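
The SUPPORTED_MODELS entries look like name fragments rather than full model identifiers. The sketch below shows one way such a tuple could be checked against a model_name. It assumes case-insensitive substring matching, which is a guess and not confirmed by this page; is_supported is a hypothetical helper, not part of neuralset.

```python
# Values copied from the SUPPORTED_MODELS class variable listed above.
SUPPORTED_MODELS = ("vjepa2", "videomae", "google/vivit", "facebook/timesformer")


def is_supported(model_name: str) -> bool:
    """Return True if any supported fragment occurs in model_name.

    Assumption: matching is a case-insensitive substring test; the
    library's actual validation may differ.
    """
    name = model_name.lower()
    return any(fragment in name for fragment in SUPPORTED_MODELS)


# The default model contains the 'videomae' fragment.
default_ok = is_supported("MCG-NJU/videomae-base")
# An image model identifier matches none of the fragments.
image_ok = is_supported("google/vit-base-patch16-224")
```

Under this assumption the default model passes the check, while an image model identifier such as the ViT name above does not, which is consistent with the note that image models belong in HuggingFaceImage.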