Update README.md
README.md
CHANGED
@@ -1,3 +1,19 @@
1 |
# JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
|
2 |
|
3 |
JAM is a rectified flow-based model for lyrics-to-song generation that addresses the lack of fine-grained, word-level controllability in existing lyrics-to-song models. Built on a compact 530M-parameter architecture with 16 LLaMA-style Transformer layers as the Diffusion Transformer (DiT) backbone, JAM gives musicians the precise vocal control they want in their workflows. Unlike previous models, JAM provides word- and phoneme-level timing control, allowing musicians to specify the exact placement of each vocal sound for improved rhythmic flexibility and expressive timing.
|
@@ -267,4 +283,4 @@ For questions, concerns, or collaboration inquiries, please contact the Project
|
|
267 |
For issues and questions:
|
268 |
- Open an issue on GitHub
|
269 |
- Check the troubleshooting section above
|
270 |
-
- Review the configuration options for parameter tuning
|
|
|
1 |
+
---
|
2 |
+
language:
|
3 |
+
- en
|
4 |
+
metrics:
|
5 |
+
- PER
|
6 |
+
- WER
|
7 |
+
- SongEval
|
8 |
+
- Audio Aesthetics
|
9 |
+
- MuQ
|
10 |
+
- FAD
|
11 |
+
pipeline_tag: text-to-audio
|
12 |
+
library_name: diffusers
|
13 |
+
tags:
|
14 |
+
- music
|
15 |
+
- art
|
16 |
+
---
|
17 |
# JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
|
18 |
|
19 |
JAM is a rectified flow-based model for lyrics-to-song generation that addresses the lack of fine-grained, word-level controllability in existing lyrics-to-song models. Built on a compact 530M-parameter architecture with 16 LLaMA-style Transformer layers as the Diffusion Transformer (DiT) backbone, JAM gives musicians the precise vocal control they want in their workflows. Unlike previous models, JAM provides word- and phoneme-level timing control, allowing musicians to specify the exact placement of each vocal sound for improved rhythmic flexibility and expressive timing.
|
|
|
283 |
For issues and questions:
|
284 |
- Open an issue on GitHub
|
285 |
- Check the troubleshooting section above
|
286 |
+
- Review the configuration options for parameter tuning
|
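The front-matter block added at the top of README.md in this change (lines 1 to 16 on the new side) can be checked locally with a small script. What follows is a minimal sketch, not how the Hugging Face Hub itself reads model-card metadata: `parse_front_matter` is a hypothetical helper written for this illustration, and it only understands the two shapes that appear in this diff (`key: value` scalars and `key:` followed by `- item` lists). A real YAML library would be the safer choice for anything more complex.

```python
def parse_front_matter(text):
    """Split a README into (metadata dict, body).

    Hypothetical helper: handles only `key: value` scalars and
    `key:` followed by `- item` list entries, as used in this diff.
    """
    lines = text.splitlines()
    # Front matter must start on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}, text

    meta = {}
    current_key = None  # key whose list items we are currently collecting
    for i, raw in enumerate(lines[1:], start=1):
        line = raw.strip()
        if line == "---":
            # Closing delimiter: everything after it is the README body.
            return meta, "\n".join(lines[i + 1:])
        if not line:
            continue
        if line.startswith("- ") and current_key is not None:
            meta[current_key].append(line[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []
                current_key = key
    # No closing delimiter found: treat the whole text as body.
    return {}, text


# The metadata added by this commit, followed by the unchanged title line.
README = """---
language:
- en
metrics:
- PER
- WER
- SongEval
- Audio Aesthetics
- MuQ
- FAD
pipeline_tag: text-to-audio
library_name: diffusers
tags:
- music
- art
---
# JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
"""

meta, body = parse_front_matter(README)
print(meta["pipeline_tag"], meta["tags"])
```

A check like this could catch a missing closing `---` or a mis-indented list item before the README is pushed.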