license: cdla-permissive-2.0
Model Summary
DAC auto-encoder models provide a compact discrete tokenization of speech and audio signals that facilitates signal generation by cascaded generative AI models (e.g., multi-modal generative AI models) and high-quality reconstruction of the original signals. The current models improve upon the original DAC models by allowing a more compact representation of speech-only signals while retaining high-quality signal reconstruction.
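To give a rough sense of how compact the representation is, the sketch below compares the code bitrate with uncompressed PCM. It assumes that the weights name used in the Usage section (weights_24khz_1.5kbps_v1.0) denotes a 24 kHz sample rate and a 1.5 kbps code bitrate, and that the reference is 16-bit mono PCM; neither assumption is stated in this model card.

```python
# Assumptions (not stated in the model card): the weights name
# "weights_24khz_1.5kbps_v1.0" is read as a 24 kHz sample rate and a
# 1.5 kbps code bitrate, and the reference format is 16-bit mono PCM.
SAMPLE_RATE_HZ = 24_000
PCM_BITS_PER_SAMPLE = 16
CODEC_BITRATE_BPS = 1_500


def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(pcm_bps, codec_bps):
    """How many times smaller the code stream is than the PCM stream."""
    return pcm_bps / codec_bps


pcm_bps = pcm_bitrate_bps(SAMPLE_RATE_HZ, PCM_BITS_PER_SAMPLE)
ratio = compression_ratio(pcm_bps, CODEC_BITRATE_BPS)
print(f"PCM: {pcm_bps / 1000:.0f} kbps, codes: {CODEC_BITRATE_BPS / 1000:.1f} kbps, ratio: {ratio:.0f}x")
# prints "PCM: 384 kbps, codes: 1.5 kbps, ratio: 256x"
```

Under these assumptions the codes are about 256 times smaller than the raw waveform; the figure changes with a different reference bit depth or channel count.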
Usage
Follow the DAC installation instructions.
Download the model weights from the current repo (e.g., weights_24khz_1.5kbps_v1.0).
Compress audio
python3 -m dac encode /path/to/input --output /path/to/output/codes --weights_path /path/to/weights_24khz_1.5kbps_v1.0
This command creates .dac files with the same names as the input files. It also preserves the directory structure relative to the input root and re-creates it in the output directory. Run python3 -m dac encode --help for more options.
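The directory mirroring described for the encode command can be pictured with a small path-mapping sketch. The helper name mirrored_output_path and the example paths are hypothetical, and this only illustrates the described behavior; it is not DAC's own implementation.

```python
from pathlib import PurePosixPath


def mirrored_output_path(input_file, input_root, output_root, suffix=".dac"):
    """Hypothetical helper: map an input file to its mirrored output path.

    The path of the file relative to the input root is kept, the file
    extension is swapped for `suffix`, and the result is placed under
    the output root.
    """
    relative = PurePosixPath(input_file).relative_to(input_root)
    return PurePosixPath(output_root) / relative.with_suffix(suffix)


# Example paths are made up for illustration.
out = mirrored_output_path(
    "/data/in/speaker1/utt1.wav", "/data/in", "/data/out/codes"
)
print(out)  # /data/out/codes/speaker1/utt1.dac
```

The same mapping with suffix=".wav" would describe the decode direction, where the inputs are the .dac files.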
Reconstruct audio from compressed codes
python3 -m dac decode /path/to/output/codes --output /path/to/reconstructed_input --weights_path /path/to/weights_24khz_1.5kbps_v1.0
This command creates .wav files with the same names as the input .dac files. It also preserves the directory structure relative to the input root and re-creates it in the output directory. Run python3 -m dac decode --help for more options.
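For batch jobs, the two commands above can be driven from a script. The following is a minimal sketch that only uses the arguments shown in this section (the input path, --output, and --weights_path); the function names dac_command and run_round_trip are hypothetical, and the paths are placeholders.

```python
import subprocess


def dac_command(mode, source, output, weights_path, python="python3"):
    """Build the argument list for `python3 -m dac encode|decode`.

    Only the arguments shown in this README are used.
    """
    if mode not in ("encode", "decode"):
        raise ValueError(f"mode must be 'encode' or 'decode', got {mode!r}")
    return [
        python, "-m", "dac", mode, str(source),
        "--output", str(output),
        "--weights_path", str(weights_path),
    ]


def run_round_trip(input_dir, codes_dir, recon_dir, weights_path):
    """Encode a directory to .dac codes, then decode the codes to .wav."""
    for mode, src, dst in (
        ("encode", input_dir, codes_dir),
        ("decode", codes_dir, recon_dir),
    ):
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(dac_command(mode, src, dst, weights_path), check=True)


# Show the encode command without running it.
print(" ".join(dac_command(
    "encode", "/path/to/input", "/path/to/output/codes",
    "/path/to/weights_24khz_1.5kbps_v1.0",
)))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.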
Programmatic Usage
import dac
from audiotools import AudioSignal
# Load the model from the downloaded weights
model_path = "/path/to/weights_24khz_1.5kbps_v1.0"
model = dac.DAC.load(model_path)
model.to('cuda')
# Load audio signal file
signal = AudioSignal('input.wav')
# Encode audio signal as one long file
# (may run out of GPU memory on long files)
signal.to(model.device)
x = model.preprocess(signal.audio_data, signal.sample_rate)
z, codes, latents, _, _ = model.encode(x)
# Decode audio signal
y = model.decode(z)
# Alternatively, use the `compress` and `decompress` functions
# to compress long files.
signal = signal.cpu()
x = model.compress(signal)
# Save and load to and from disk
x.save("compressed.dac")
x = dac.DACFile.load("compressed.dac")
# Decompress it back to an AudioSignal
y = model.decompress(x)
# Write to file
y.write('output.wav')