One of the core goals of fairseq2 is to make it possible for researchers to
explore new ideas and implement novel features without having to fork fairseq2.
Rather than being a monolithic repository that can only be modified by
copy-pasting large chunks of code, fairseq2 has all of its major APIs follow the
interface/implementation convention along with the dependency inversion principle.
This means that each API has an interface (i.e. an abstract class derived from
ABC) that defines the contract of that API, and one or more concrete
implementations of that interface. Different implementations can be integrated
with the rest of fairseq2 via its dependency injection framework.
These diagrams demonstrate fairseq2’s interface-first approach: each API starts with
a clear abstract interface, followed by multiple concrete implementations that can
be used interchangeably.
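The interface/implementation convention can be sketched in plain Python. This is a minimal illustration, not fairseq2 code: `TextNormalizer`, its two implementations, and `preprocess` are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod


class TextNormalizer(ABC):
    """Hypothetical interface: defines the contract, not the behavior."""

    @abstractmethod
    def normalize(self, text: str) -> str:
        """Return a normalized copy of ``text``."""


class LowercaseNormalizer(TextNormalizer):
    """One concrete implementation of the hypothetical interface."""

    def normalize(self, text: str) -> str:
        return text.lower()


class WhitespaceNormalizer(TextNormalizer):
    """Another implementation: collapses runs of whitespace."""

    def normalize(self, text: str) -> str:
        return " ".join(text.split())


def preprocess(lines: list[str], normalizer: TextNormalizer) -> list[str]:
    # Depends only on the interface, so any implementation can be passed in.
    return [normalizer.normalize(line) for line in lines]
```

Because `preprocess` is written against the abstract class, a new normalizer can be added without touching it, which is the property the convention is meant to provide.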
The dependency inversion principle is critical to having a clean, well-tested, and
extensible API. The example below shows the (abbreviated) __init__() method
of the StandardTransformerDecoderLayer:
Instead of constructing the multihead attention and feed-forward network layers
within its __init__() method, StandardTransformerDecoderLayer
expects the caller to provide instances of MultiheadAttention and
FeedForwardNetwork interfaces. This loose coupling between an instance
and its dependencies enables composing diverse object graphs, such as different
model architectures, with minimal redundancy (i.e. code duplication).
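The constructor-injection pattern described above might look roughly like the following. This is a simplified sketch and not the actual fairseq2 source: the classes here are plain-Python stand-ins that only borrow the interface names from the text, they operate on lists of floats instead of tensors, and `ToyAttention`, `ToyFFN`, and `ToyDecoderLayer` are hypothetical.

```python
from abc import ABC, abstractmethod


class MultiheadAttention(ABC):
    """Stand-in interface for illustration (not fairseq2's class)."""

    @abstractmethod
    def forward(self, seqs: list[float]) -> list[float]: ...


class FeedForwardNetwork(ABC):
    """Stand-in interface for illustration (not fairseq2's class)."""

    @abstractmethod
    def forward(self, seqs: list[float]) -> list[float]: ...


class ToyAttention(MultiheadAttention):
    def forward(self, seqs: list[float]) -> list[float]:
        # Placeholder for attention: every position sees the sequence mean.
        mean = sum(seqs) / len(seqs)
        return [mean for _ in seqs]


class ToyFFN(FeedForwardNetwork):
    def __init__(self, scale: float) -> None:
        self.scale = scale

    def forward(self, seqs: list[float]) -> list[float]:
        return [self.scale * x for x in seqs]


class ToyDecoderLayer:
    def __init__(self, self_attn: MultiheadAttention, ffn: FeedForwardNetwork) -> None:
        # Dependencies are provided by the caller, not constructed here.
        self.self_attn = self_attn
        self.ffn = ffn

    def forward(self, seqs: list[float]) -> list[float]:
        # Residual connection around each injected sub-layer.
        attn_out = self.self_attn.forward(seqs)
        seqs = [x + a for x, a in zip(seqs, attn_out)]
        ffn_out = self.ffn.forward(seqs)
        return [x + f for x, f in zip(seqs, ffn_out)]


layer = ToyDecoderLayer(ToyAttention(), ToyFFN(scale=2.0))
```

Swapping in a different attention or feed-forward implementation only changes the last line; `ToyDecoderLayer` itself stays untouched.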
fairseq2 v0.5 introduces a dependency injection framework that
significantly simplifies the construction and management of complex object graphs.
The core components are the DependencyContainer and
DependencyResolver classes, which provide automatic dependency resolution,
singleton management, and collection handling.
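To make these three responsibilities concrete, here is a toy container written from scratch. It is an assumption-laden illustration of the general technique, not fairseq2's `DependencyContainer` or `DependencyResolver` API: `ToyContainer`, `register`, `resolve_all`, `Clock`, and `Logger` are hypothetical names, and the real classes may differ in both signatures and behavior.

```python
from typing import Any, Callable, TypeVar

T = TypeVar("T")


class ToyContainer:
    """Illustrative container; NOT fairseq2's DependencyContainer API."""

    def __init__(self) -> None:
        self._factories: dict[type, list[Callable[["ToyContainer"], Any]]] = {}
        self._singletons: dict[tuple[type, int], Any] = {}

    def register(self, kls: type[T], factory: Callable[["ToyContainer"], T]) -> None:
        # Multiple registrations per type are kept to support collections.
        self._factories.setdefault(kls, []).append(factory)

    def resolve(self, kls: type[T]) -> T:
        # In this sketch the most recent registration wins.
        factories = self._factories.get(kls)
        if not factories:
            raise LookupError(f"No registration found for {kls.__name__}.")
        return self._get_or_create(kls, len(factories) - 1)

    def resolve_all(self, kls: type[T]) -> list[T]:
        # Collection handling: one instance per registration, in order.
        count = len(self._factories.get(kls, []))
        return [self._get_or_create(kls, i) for i in range(count)]

    def _get_or_create(self, kls: type, index: int) -> Any:
        # Singleton management: each factory runs at most once.
        key = (kls, index)
        if key not in self._singletons:
            self._singletons[key] = self._factories[kls][index](self)
        return self._singletons[key]


class Clock:
    pass


class Logger:
    def __init__(self, clock: Clock) -> None:
        self.clock = clock


container = ToyContainer()
container.register(Clock, lambda r: Clock())
# The factory receives the container and pulls in its own dependencies.
container.register(Logger, lambda r: Logger(r.resolve(Clock)))

logger = container.resolve(Logger)
```

Resolving `Logger` transitively resolves `Clock`, and a later `container.resolve(Clock)` returns that same cached instance.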
The fairseq2 library uses the dependency injection system extensively for all
core components. The global container is initialized by fairseq2.init_fairseq2()
and can be accessed through get_dependency_resolver():
import fairseq2

from fairseq2.runtime.dependency import get_dependency_resolver
from fairseq2.assets import AssetStore
from fairseq2.device import Device
from fairseq2.models import load_model

# Initialize the library - sets up the global container
fairseq2.init_fairseq2()

# Access the global resolver
resolver = get_dependency_resolver()

# Resolve library components
asset_store = resolver.resolve(AssetStore)
device = resolver.resolve(Device)
card = asset_store.retrieve_card("llama3_1_8b_instruct")

# These are all automatically configured through the DI system
print(f"Default device: {device}")
print(f"Retrieved card: {card}")
The diagram below illustrates how fairseq2’s dependency injection system orchestrates
recipe execution, from initial composition through to task execution:
flowchart TD
%% Styling
classDef containerBox fill:#e1f5fe,stroke:#0288d1,stroke-width:2px,color:#01579b
classDef registryBox fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#4a148c
classDef executionBox fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#1b5e20
classDef componentBox fill:#fff3e0,stroke:#f57c00,stroke-width:1px,color:#e65100
%% Step 0: Composition Phase
subgraph S0["🔧 Composition Phase"]
direction TB
subgraph Inputs["Core Dependencies"]
direction TB
WI[World Info]
DV[Device]
ENV[Environment]
FS[File System]
TH[Thread Pool]
RNG[RNG Bag]
componentBox:::componentBox
end
subgraph Registry["Extension Registry"]
direction TB
AS[Asset Store]
MF[Model Families]
TF[Tokenizer Families]
EX[Extensions]
CH[Checkpoint Handlers]
registryBox:::registryBox
end
subgraph Recipe["Recipe Components"]
direction TB
RC[Recipe Config]
OD[Output Directory]
TR[Task Runner]
executionBox:::executionBox
end
Inputs --> LR[Library Registration<br/>_register_library]
Registry --> LR
LR --> RR[Recipe Registration<br/>_register_*_recipe]
Recipe --> RUR[Run Registration<br/>_register_run]
RR --> RUR
end
%% Dependency Container
RUR --> DC[📦 Dependency Container<br/>Auto-wiring & Resolution]
DC:::containerBox
%% Step 1: Execution Phase
DC --> S1
subgraph S1["⚡ Execution Phase (_run_recipe)"]
direction TB
CP[Cluster Preparer<br/>🏗️ Environment Setup]
LC[Log Configurer<br/>📋 Distributed Logging]
CD[Config Dumper<br/>💾 Save Configuration]
TC[Torch Configurer<br/>🔥 PyTorch Setup]
LH[Log Helper<br/>📊 System Info Logging]
TR2[Task Runner<br/>🚀 Execute Recipe Task]
CP --> LC --> CD --> TC --> LH --> TR2
executionBox:::executionBox
end
%% Resolution arrows
DC -.->|"resolve(ClusterPreparer)"| CP
DC -.->|"resolve(LogConfigurer)"| LC
DC -.->|"resolve(ConfigDumper)"| CD
DC -.->|"resolve(TorchConfigurer)"| TC
DC -.->|"resolve(LogHelper)"| LH
DC -.->|"resolve(TaskRunner)"| TR2
This flow demonstrates several key concepts:
1. Composition Phase - All components are registered with the container: