Package audiocraft
AudioCraft is a general framework for training audio generative models. At the moment we provide the training code for:
- MusicGen, a state-of-the-art text-to-music and melody+text autoregressive generative model. For the solver, see MusicGenSolver, and for the model, MusicGen.
- AudioGen, a state-of-the-art text-to-general-audio generative model.
- EnCodec, an efficient and high-fidelity neural audio codec which provides an excellent tokenizer for autoregressive language models. See CompressionSolver and EncodecModel.
- MultiBandDiffusion, an alternative diffusion-based decoder compatible with EnCodec that improves the perceived quality and reduces the artifacts coming from adversarial decoders.
- JASCO: Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation.
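To make the "tokenizer" role of a neural codec more concrete, the following is a minimal, standard-library sketch of residual vector quantization (RVQ), the technique provided by the audiocraft.quantization sub-module. It is an illustration only, not the audiocraft API: the function names (`nearest`, `rvq_encode`, `rvq_decode`) and the hand-written two-stage codebooks are hypothetical, whereas a real codec learns its codebooks and operates on batched tensors.

```python
from typing import List, Sequence

Vector = List[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codebook entry closest to x (squared L2)."""
    def dist(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(codebooks: Sequence[Sequence[Vector]], x: Vector) -> List[int]:
    """Quantize x stage by stage; each stage encodes the previous residual."""
    residual = list(x)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Whatever this stage failed to capture is passed to the next one.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codebooks: Sequence[Sequence[Vector]], codes: Sequence[int]) -> Vector:
    """Sum the selected entry of every stage to approximate the input."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy codebooks with made-up values: a coarse stage, then a finer one.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, -0.25]],
]

codes = rvq_encode(CODEBOOKS, [1.2, 0.9])   # one discrete token per stage
approx = rvq_decode(CODEBOOKS, codes)       # coarse entry + fine correction
```

The list of integer codes is what an autoregressive language model would consume as tokens; adding more stages trades a longer code for a closer reconstruction.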
Sub-modules
audiocraft.adversarial
-
Adversarial losses and discriminator architectures.
audiocraft.data
-
Audio loading and writing support. Datasets for raw audio, optionally including metadata.
audiocraft.environment
-
Provides cluster and tool configuration across clusters (Slurm, Dora, utilities).
audiocraft.grids
-
Dora Grids.
audiocraft.losses
-
Loss related classes and functions. In particular the loss balancer from EnCodec, and the usual spectral losses.
audiocraft.metrics
-
Metrics like CLAP score, FAD, KLD, Visqol, Chroma similarity, etc.
audiocraft.models
-
Models for EnCodec, AudioGen, MusicGen, as well as the generic LMModel.
audiocraft.modules
-
Modules used for building the models.
audiocraft.optim
-
Optimization utilities. In particular, optimizers (DAdaptAdam), schedulers, and Exponential Moving Average.
audiocraft.quantization
-
Residual vector quantization (RVQ).
audiocraft.solvers
-
Solvers. A Solver is a training recipe that combines the dataloaders, models, optimizer, losses, etc. into a single convenient object.
audiocraft.train
-
Entry point for dora to launch solvers that run training loops. For more information on how to use dora, see: https://github.com/facebookresearch/dora
audiocraft.utils
-
Utilities.
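
As a rough illustration of the "training recipe" idea behind audiocraft.solvers, the sketch below bundles a dataloader, a model, a loss and an optimizer step into one object. It uses only the standard library, and every name in it (`ToySolver`, `loss_and_grad`, `run_epoch`, `run`) is hypothetical; the actual solvers are PyTorch-based and launched through dora, so this shows the pattern rather than the real interface.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple

Batch = List[Tuple[float, float]]  # (input, target) pairs


@dataclass
class ToySolver:
    """Hypothetical solver bundling data, model, loss and optimizer settings."""
    dataloader: Iterable[Batch]
    weight: float = 0.0   # the "model": y = weight * x
    lr: float = 0.1       # the "optimizer": plain SGD step size
    history: List[float] = field(default_factory=list)

    def loss_and_grad(self, batch: Batch) -> Tuple[float, float]:
        """Mean squared error and its gradient with respect to the weight."""
        n = len(batch)
        loss = sum((self.weight * x - y) ** 2 for x, y in batch) / n
        grad = sum(2 * (self.weight * x - y) * x for x, y in batch) / n
        return loss, grad

    def run_epoch(self) -> float:
        """One pass over the data; returns the average batch loss."""
        total, count = 0.0, 0
        for batch in self.dataloader:
            loss, grad = self.loss_and_grad(batch)
            self.weight -= self.lr * grad  # SGD update
            total += loss
            count += 1
        return total / count

    def run(self, epochs: int) -> None:
        """Run the training loop and record the loss of each epoch."""
        for _ in range(epochs):
            self.history.append(self.run_epoch())


# Toy dataset following y = 2 * x, split into two batches.
data = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
solver = ToySolver(dataloader=data)
solver.run(epochs=20)
```

Keeping the loop, the state and the logging in one object means a single call such as `solver.run(...)` reproduces the whole recipe, which is the convenience the Solver description refers to.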