fairseq2.nn.data_parallel

Interfaces

class fairseq2.nn.data_parallel.DataParallelFacade(*args, **kwargs)

Bases: ABC, Stateful

Provides an API-agnostic way to interact with different data parallelism implementations.

DDP, FSDP, and other data parallelism implementations expose different APIs for operations such as state handling and gradient clipping. This interface acts as a facade, providing a consistent way to access these underlying APIs.

abstract state_dict() -> dict[str, object]
abstract load_state_dict(state_dict: dict[str, object]) -> None
abstract no_sync() -> AbstractContextManager[None]
abstract clip_grad_norm(max_norm: float | None) -> Tensor
abstract summon_full_parameters() -> AbstractContextManager[None]
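
To illustrate the facade idea, here is a minimal, self-contained sketch that does not import fairseq2. FacadeSketch and SingleProcessFacade are hypothetical stand-ins that mirror the five abstract methods listed above. Plain floats replace tensors, and the clipping behavior shown (return the pre-clip norm, leave gradients untouched when max_norm is None) is an assumption for illustration, not confirmed fairseq2 behavior.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod
from contextlib import AbstractContextManager, nullcontext


class FacadeSketch(ABC):
    """Hypothetical stand-in that mirrors the five abstract methods."""

    @abstractmethod
    def state_dict(self) -> dict[str, object]: ...

    @abstractmethod
    def load_state_dict(self, state_dict: dict[str, object]) -> None: ...

    @abstractmethod
    def no_sync(self) -> AbstractContextManager[None]: ...

    @abstractmethod
    def clip_grad_norm(self, max_norm: float | None) -> float: ...

    @abstractmethod
    def summon_full_parameters(self) -> AbstractContextManager[None]: ...


class SingleProcessFacade(FacadeSketch):
    """Hypothetical implementation for a setup with nothing to synchronize."""

    def __init__(self, params: dict[str, float], grads: dict[str, float]) -> None:
        self._params = params
        self._grads = grads

    def state_dict(self) -> dict[str, object]:
        return dict(self._params)

    def load_state_dict(self, state_dict: dict[str, object]) -> None:
        self._params.clear()
        self._params.update(state_dict)

    def no_sync(self) -> AbstractContextManager[None]:
        # No gradient synchronization happens here, so there is nothing to suspend.
        return nullcontext()

    def clip_grad_norm(self, max_norm: float | None) -> float:
        # Assumption: return the L2 norm measured before clipping and
        # skip the rescaling entirely when max_norm is None.
        total = math.sqrt(sum(g * g for g in self._grads.values()))
        if max_norm is not None and total > max_norm:
            scale = max_norm / total
            for name in self._grads:
                self._grads[name] *= scale
        return total

    def summon_full_parameters(self) -> AbstractContextManager[None]:
        # Parameters are not sharded in this sketch, so they are already "full".
        return nullcontext()


def training_step(facade: FacadeSketch, max_norm: float | None) -> float:
    # The caller only sees the facade, never the implementation behind it.
    with facade.no_sync():
        pass  # gradient accumulation would happen here
    return facade.clip_grad_norm(max_norm)


grads = {"w": 3.0, "b": 4.0}
facade = SingleProcessFacade({"w": 1.0, "b": 0.0}, grads)
norm = training_step(facade, max_norm=1.0)
```

Because training_step depends only on the abstract interface, a different implementation could be substituted without changing the calling code.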

Functions

fairseq2.nn.data_parallel.get_data_parallel_facade(module: Module) -> DataParallelFacade

Returns the data parallel facade associated with the specified module.

If module is of type DDPModule, FSDP1Module, or FSDP2Module, this function will return the corresponding facade, even if one was not previously set.

If the module is not a data parallel module and has no facade, this function will return a no-op implementation.

fairseq2.nn.data_parallel.set_data_parallel_facade(module: Module, facade: DataParallelFacade) -> None

Associates facade with the specified module.

Raises:

InvalidOperationError – if the module already has a facade associated with it.