SongGeneration
Demo | Paper | Code | Space Demo
This repository is the official weight repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment. It provides the SongGeneration model, inference scripts, and a checkpoint trained on the Million Song Dataset.
Model Versions
Model | Max Length | Language | GPU Memory | RTF (A100) | Download Link |
---|---|---|---|---|---|
SongGeneration-base | 2m30s | zh | 10G/16G | 1.26 | You are here |
SongGeneration-base-new | 2m30s | zh, en | 10G/16G | 1.26 | Huggingface |
SongGeneration-base-full | 4m30s | zh, en | 12G/18G | 1.30 | Huggingface |
SongGeneration-large | 4m30s | zh, en | 22G/28G | 1.51 | Huggingface |
SongGeneration-v1.5-small | 2m | zh, en, es, ja, etc. | - | - | Coming soon |
SongGeneration-v1.5-base | 4m30s | zh, en, es, ja, etc. | - | - | Coming soon |
SongGeneration-v1.5-large | 4m30s | zh, en, es, ja, etc. | - | - | Coming soon |
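As a rough guide to reading the table, the sketch below estimates wall-clock generation time for a maximum-length song. It assumes the A100 column is a real-time factor, meaning generation time divided by the duration of the generated audio; the table itself does not define the column, so treat this reading as an assumption. The helper names `parse_duration` and `estimated_generation_seconds` are made up for illustration and are not part of the SongGeneration code.

```python
def parse_duration(text: str) -> int:
    """Convert a table-style duration such as '2m30s' or '2m' to seconds."""
    minutes, _, rest = text.partition("m")
    seconds = rest.rstrip("s")
    return int(minutes) * 60 + (int(seconds) if seconds else 0)


def estimated_generation_seconds(max_length: str, factor: float) -> float:
    """Estimate generation time, ASSUMING factor = generation time / audio duration."""
    return parse_duration(max_length) * factor


# Values copied from the table above (released models only).
MODELS = {
    "SongGeneration-base": ("2m30s", 1.26),
    "SongGeneration-base-new": ("2m30s", 1.26),
    "SongGeneration-base-full": ("4m30s", 1.30),
    "SongGeneration-large": ("4m30s", 1.51),
}

if __name__ == "__main__":
    for name, (max_length, factor) in MODELS.items():
        estimate = estimated_generation_seconds(max_length, factor)
        print(f"{name}: about {estimate:.0f} s for a {max_length} song on an A100")
```

Under that assumption, a full-length song from SongGeneration-base (150 s of audio) would take roughly 150 × 1.26 ≈ 189 s to generate on an A100. Actual timings depend on hardware, settings, and song length.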
Overview
We develop the SongGeneration model, an LM-based framework consisting of LeLM and a music codec. LeLM models two types of tokens in parallel: mixed tokens, which represent the combined audio of vocals and accompaniment to achieve vocal-instrument harmony, and dual-track tokens, which encode vocals and accompaniment separately for high-quality song generation. The music codec reconstructs the dual-track tokens into high-fidelity music audio. SongGeneration significantly improves over existing open-source music generation models and performs competitively with current state-of-the-art industry systems. For more details, please refer to our paper.

License
The code and weights in this repository are released under the terms of the LICENSE file.