NeuCodec decoder fine-tuned for German speech

This repository contains only the decoder of neuphonic/neucodec, fine-tuned on equal amounts of German and English speech from Emilia-Yodas to improve the decoding quality of German speech. Because only the decoder was fine-tuned, the codebook is identical to the base model's, so this decoder can be used with the regular NeuCodec encoder.

We supply a compact class in NeuCodecDecoder.py for running inference with this decoder, since the NeuCodec codebase doesn't easily allow loading model files from other Hugging Face repositories.

Inference Example

import torch
import torchaudio

from NeuCodecDecoder import NeuCodecDecoder

decoder_model = NeuCodecDecoder.from_pretrained("DigitalLearningGmbH/neucodec-decoder-ft-de")
decoder_model = decoder_model.eval().cuda()

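# `tokens` is not defined here: it is expected to be a 1-D sequence of NeuCodec
# code indices, e.g. produced by the regular NeuCodec encoder.
codes = torch.tensor(tokens).unsqueeze(0).unsqueeze(0).to("cuda")  # (1, 1, num_tokens)

with torch.no_grad():
    decoded = decoder_model.decode_code(codes).cpu()

# Write the decoded audio at a 24 kHz sample rate.
torchaudio.save("decoded.wav", decoded[0, :, :], 24_000)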
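
The example above assumes that the codes already exist. Since the codebook is unchanged, one way to obtain them is to run the regular NeuCodec encoder on an audio file and pass its output to this decoder. The following is a minimal sketch of such a round trip, not a tested recipe. It assumes that the upstream `neucodec` package exposes `NeuCodec.from_pretrained` and `encode_code` as described in the original model card, that the encoder expects 16 kHz mono input of shape (1, 1, T), and that `encode_code` returns codes of shape (1, 1, num_tokens). The function name, its parameters, and the file names are hypothetical.

```python
def roundtrip_with_finetuned_decoder(wav_path, out_path="roundtrip.wav", device="cuda"):
    """Encode with the base NeuCodec encoder, decode with the fine-tuned decoder."""
    # Imports are local so that merely defining this function loads nothing.
    import torch
    import torchaudio
    from torchaudio import transforms as T
    from neucodec import NeuCodec                # assumption: upstream package and class name
    from NeuCodecDecoder import NeuCodecDecoder  # class shipped with this repository

    encoder = NeuCodec.from_pretrained("neuphonic/neucodec").eval().to(device)
    decoder = NeuCodecDecoder.from_pretrained(
        "DigitalLearningGmbH/neucodec-decoder-ft-de"
    ).eval().to(device)

    wav, sr = torchaudio.load(wav_path)   # (channels, T)
    wav = wav.mean(dim=0, keepdim=True)   # downmix to mono: (1, T)
    if sr != 16_000:                      # assumption: encoder expects 16 kHz input
        wav = T.Resample(sr, 16_000)(wav)
    wav = wav.unsqueeze(0)                # add batch dimension: (1, 1, T_16)

    with torch.no_grad():
        # Assumption: encode_code accepts a CPU waveform tensor and returns
        # codes of shape (1, 1, num_tokens), as in the upstream example.
        codes = encoder.encode_code(wav)
        decoded = decoder.decode_code(codes.to(device)).cpu()

    # The decoder output is saved at 24 kHz, as in the example above.
    torchaudio.save(out_path, decoded[0, :, :], 24_000)
    return out_path
```

Calling `roundtrip_with_finetuned_decoder("input.wav")` with a German recording would then write the reconstruction to `roundtrip.wav`, which can be compared by ear against a reconstruction from the base model's own decoder.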

For more information, please refer to the original neuphonic/neucodec model card.
