Package audiocraft
AudioCraft is a general framework for training audio generative models. At the moment we provide the training code for:
- MusicGen, a state-of-the-art text-to-music and melody+text autoregressive generative model. For the solver, see MusicGenSolver, and for the model, MusicGen.
- AudioGen, a state-of-the-art text-to-general-audio generative model.
- EnCodec, an efficient and high-fidelity neural audio codec which provides an excellent tokenizer for autoregressive language models. See CompressionSolver and EncodecModel.
- MultiBandDiffusion, an alternative diffusion-based decoder compatible with EnCodec that improves the perceived quality and reduces the artifacts coming from adversarial decoders.
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.
"""
AudioCraft is a general framework for training audio generative models.
At the moment we provide the training code for:
- [MusicGen](https://arxiv.org/abs/2306.05284), a state-of-the-art
text-to-music and melody+text autoregressive generative model.
For the solver, see `audiocraft.solvers.musicgen.MusicGenSolver`, and for the model,
`audiocraft.models.musicgen.MusicGen`.
- [AudioGen](https://arxiv.org/abs/2209.15352), a state-of-the-art
text-to-general-audio generative model.
- [EnCodec](https://arxiv.org/abs/2210.13438), efficient and high fidelity
neural audio codec which provides an excellent tokenizer for autoregressive language models.
See `audiocraft.solvers.compression.CompressionSolver`, and `audiocraft.models.encodec.EncodecModel`.
- [MultiBandDiffusion](TODO), alternative diffusion-based decoder compatible with EnCodec that
improves the perceived quality and reduces the artifacts coming from adversarial decoders.
"""
# flake8: noqa
from . import data, modules, models
__version__ = '1.4.0a1'
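The `__version__` string above, `'1.4.0a1'`, has the PEP 440 pre-release shape: release 1.4.0, alpha 1. As a minimal sketch, assuming only version strings of the form `X.Y.Z` with an optional `a`, `b`, or `rc` suffix, it can be split into comparable parts with the standard library. The names `parse_version`, `ParsedVersion`, and `is_prerelease` are hypothetical and not part of audiocraft; `packaging.version.Version` is the more complete option when that package is available.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical helper, not part of audiocraft: handles only "X.Y.Z" with an
# optional a/b/rc pre-release suffix, which covers the string shown above.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?$")


class ParsedVersion(NamedTuple):
    major: int
    minor: int
    patch: int
    pre_kind: Optional[str]    # "a", "b", "rc", or None for a final release
    pre_number: Optional[int]  # pre-release counter, or None


def parse_version(text: str) -> ParsedVersion:
    """Split a simple version string into numeric and pre-release parts."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    major, minor, patch, kind, number = match.groups()
    return ParsedVersion(
        int(major),
        int(minor),
        int(patch),
        kind,
        int(number) if number is not None else None,
    )


def is_prerelease(text: str) -> bool:
    """Return True when the version carries an a/b/rc suffix."""
    return parse_version(text).pre_kind is not None


print(parse_version("1.4.0a1"))
print(is_prerelease("1.4.0a1"))
```

A check like `is_prerelease` can be useful before pinning a dependency, since alpha builds may change without notice.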
Sub-modules
- audiocraft.adversarial: Adversarial losses and discriminator architectures.
- audiocraft.data: Audio loading and writing support. Datasets for raw audio, optionally including some metadata.
- audiocraft.environment: Provides cluster and tools configuration across clusters (slurm, dora, utilities).
- audiocraft.grids: Dora grids.
- audiocraft.losses: Loss-related classes and functions, in particular the loss balancer from EnCodec and the usual spectral losses.
- audiocraft.metrics: Metrics such as CLAP score, FAD, KLD, ViSQOL, chroma similarity, etc.
- audiocraft.models: Models for EnCodec, AudioGen, MusicGen, as well as the generic LMModel.
- audiocraft.modules: Modules used for building the models.
- audiocraft.optim: Optimization utilities, in particular optimizers (DAdaptAdam), schedulers, and Exponential Moving Average (EMA).
- audiocraft.quantization: Residual vector quantization (RVQ).
- audiocraft.solvers: Solvers. A Solver is a training recipe that combines the dataloaders, models, optimizer, losses, etc. into a single convenient object.
- audiocraft.train: Entry point for dora to launch solvers for running training loops. For more information on using dora, see https://github.com/facebookresearch/dora
- audiocraft.utils: Utilities.
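The description of `audiocraft.solvers` above (a training recipe that bundles dataloaders, model, optimizer, and losses into one object) can be illustrated with a toy example. This is a minimal sketch in plain Python and not audiocraft's solver API: `ToySolver`, its fields, and the one-parameter "model" are all hypothetical, chosen only to show the shape of the idea without any deep-learning dependency.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Batch = Tuple[float, float]  # (input x, target y)


@dataclass
class ToySolver:
    """Hypothetical recipe object: data, model weight, loss, and update rule."""

    dataloader: Sequence[Batch]                       # re-iterable batches
    loss_fn: Callable[[float, float], float]          # loss(prediction, target)
    grad_fn: Callable[[float, float, float], float]   # d(loss)/d(weight)
    lr: float = 0.05
    weight: float = 0.0                               # model: y_hat = weight * x
    history: List[float] = field(default_factory=list)

    def run_epoch(self) -> float:
        """One pass over the data with a plain SGD step per batch."""
        total = 0.0
        for x, y in self.dataloader:
            pred = self.weight * x
            total += self.loss_fn(pred, y)
            self.weight -= self.lr * self.grad_fn(pred, y, x)
        mean_loss = total / max(len(self.dataloader), 1)
        self.history.append(mean_loss)
        return mean_loss

    def run(self, epochs: int) -> None:
        for _ in range(epochs):
            self.run_epoch()


# Toy data drawn from y = 2 * x, squared-error loss and its gradient.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
solver = ToySolver(
    dataloader=data,
    loss_fn=lambda pred, y: (pred - y) ** 2,
    grad_fn=lambda pred, y, x: 2.0 * (pred - y) * x,
)
solver.run(epochs=20)
print(f"learned weight: {solver.weight:.4f}")
```

Keeping the data, loss, and update rule on one object means a launcher only needs to construct the solver and call `run`, which is the convenience the sub-module description points at.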