neuralbench.modules.DownstreamWrapper
- pydantic model neuralbench.modules.DownstreamWrapper
Configuration for wrapping a (pretrained) model for downstream fine-tuning or linear probing.
This class provides a declarative way to configure how a pretrained model should be adapted for downstream tasks, including optional on-the-fly preprocessing, layer freezing, output aggregation, and adding a trainable probe on top of the model.
- Parameters:
on_the_fly_preprocessor (OnTheFlyPreprocessor | None, optional) – On-the-fly preprocessing applied to the input before the model forward pass. Typically model-specific (e.g. QuantileAbsScaler for BIOT). Default is None.
channel_adapter_config (ChannelMerger | ChannelProjection | None, optional) – Configuration for a channel adapter that projects from arbitrary input channels to a fixed number of target channels. Supply a ChannelMerger for position-based spatial attention, or a ChannelProjection for a simple Conv1d(kernel_size=1) linear mixing. Default is None.
model_output_key (str | int | None, optional) – Key or index to extract from model output dictionary. If None, assumes the model returns a tensor directly. Default is None.
layers_to_freeze (list[str] | None, optional) – List of layer name patterns to freeze (set requires_grad=False). Cannot be used together with layers_to_unfreeze. Default is None.
layers_to_unfreeze (list[str] | tp.Literal["last"] | None, optional) – List of layer name patterns to unfreeze (set requires_grad=True), while freezing all others. Cannot be used together with layers_to_freeze. If “last”, unfreezes the last layer (nn.Module) of the model. Default is None.
strict_matching (bool, optional) – If True, when freezing/unfreezing layers, only the first part of the layer name (before the first dot) must match exactly. If False, any part of the layer name can match the patterns. Default is True.
aggregation ({"flatten", "mean", "first"} or int, optional) – Method to aggregate the model output. "flatten" flattens all dimensions except batch; "mean" averages over the temporal/sequence dimension (dim=1); "first" selects only the first timestep/token; an int splits into n groups, averages each group, then concatenates; None performs no aggregation.
probe_config (Mlp | "linear" | None, optional) – Configuration for the probe layer added on top. None uses identity (no additional layer), e.g. if the model already has a linear layer of the right output size. "linear" adds a single linear layer. An Mlp instance adds a multi-layer perceptron with specified configuration.
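The interaction between layers_to_freeze, layers_to_unfreeze and strict_matching can be easier to follow with a small example. The following is a minimal sketch in plain Python, not the neuralbench implementation: the helper names name_matches and trainable_flags are hypothetical, and the matching rules are one plausible reading of the descriptions above (strict: the segment before the first dot must equal a pattern; non-strict: a pattern may appear anywhere in the dotted name). The "last" option is not covered.

```python
from typing import Dict, Optional, Sequence


def name_matches(layer_name: str, patterns: Sequence[str], strict: bool = True) -> bool:
    """Return True if `layer_name` is selected by any pattern.

    Assumed reading of `strict_matching` (not taken from the library source):
    - strict: the part before the first dot must equal one of the patterns;
    - non-strict: a pattern may occur anywhere in the dotted name.
    """
    if strict:
        return layer_name.split(".", 1)[0] in patterns
    return any(pattern in layer_name for pattern in patterns)


def trainable_flags(
    param_names: Sequence[str],
    layers_to_freeze: Optional[Sequence[str]] = None,
    layers_to_unfreeze: Optional[Sequence[str]] = None,
    strict: bool = True,
) -> Dict[str, bool]:
    """Map each parameter name to the `requires_grad` value it would receive."""
    if layers_to_freeze is not None and layers_to_unfreeze is not None:
        raise ValueError("layers_to_freeze and layers_to_unfreeze cannot be combined")
    if layers_to_freeze is not None:
        # Matching layers are frozen; everything else stays trainable.
        return {n: not name_matches(n, layers_to_freeze, strict) for n in param_names}
    if layers_to_unfreeze is not None:
        # Matching layers stay trainable; everything else is frozen.
        return {n: name_matches(n, layers_to_unfreeze, strict) for n in param_names}
    return {n: True for n in param_names}


# Hypothetical parameter names, as `named_parameters()` might report them.
names = ["encoder.layer0.weight", "encoder.layer1.weight", "head.weight"]

frozen_encoder = trainable_flags(names, layers_to_freeze=["encoder"])
strict_layer1 = trainable_flags(names, layers_to_unfreeze=["layer1"], strict=True)
loose_layer1 = trainable_flags(names, layers_to_unfreeze=["layer1"], strict=False)
```

Under these assumptions, the pattern "layer1" selects nothing in strict mode because no name starts with a "layer1" segment, whereas in non-strict mode it selects encoder.layer1.weight only.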
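As an illustration of the aggregation modes listed above, here is a minimal dependency-free sketch. It is not the library's code: nested lists shaped (batch, time, features) stand in for tensors, the function names are hypothetical, and for the int case it assumes that groups are contiguous chunks along the time dimension and that the number of timesteps is divisible by n.

```python
def mean_over_time(steps):
    """Element-wise average of equally sized feature vectors."""
    return [sum(column) / len(column) for column in zip(*steps)]


def aggregate(batch, aggregation=None):
    """Aggregate a nested list shaped (batch, time, features).

    Mirrors the documented options; the grouping rule for an int value
    is an assumption (contiguous chunks along the time dimension).
    """
    if aggregation is None:
        return batch  # no aggregation
    out = []
    for sample in batch:
        if aggregation == "flatten":
            # Collapse every dimension except batch.
            out.append([value for step in sample for value in step])
        elif aggregation == "mean":
            # Average over the temporal/sequence dimension (dim=1).
            out.append(mean_over_time(sample))
        elif aggregation == "first":
            # Keep only the first timestep/token.
            out.append(list(sample[0]))
        elif isinstance(aggregation, int):
            n = aggregation
            if n <= 0 or len(sample) % n != 0:
                raise ValueError("number of timesteps must be divisible by n")
            size = len(sample) // n
            groups = [sample[i * size:(i + 1) * size] for i in range(n)]
            # Average each group, then concatenate the group means.
            out.append([value for group in groups for value in mean_over_time(group)])
        else:
            raise ValueError(f"unknown aggregation: {aggregation!r}")
    return out


# One sample with 4 timesteps and 2 features per timestep.
batch = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]]

flat = aggregate(batch, "flatten")   # [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]]
mean = aggregate(batch, "mean")      # [[4.0, 5.0]]
first = aggregate(batch, "first")    # [[1.0, 2.0]]
grouped = aggregate(batch, 2)        # [[2.0, 3.0, 6.0, 7.0]]
```

With an int value the output keeps coarse temporal structure: n group means are concatenated, so under these assumptions the feature size grows by a factor of n compared with "mean".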
- Fields:
- field on_the_fly_preprocessor: OnTheFlyPreprocessor | None = None
- field channel_adapter_config: ChannelMerger | ChannelProjection | None = None