liu-hanghang committed on
Commit 786e957 · verified · 1 Parent(s): 8c6ebd9

Update README.md

Files changed (1)
  1. README.md +17 -1
README.md CHANGED
@@ -1,3 +1,19 @@
+ ---
+ language:
+ - en
+ metrics:
+ - PER
+ - WER
+ - SongEval
+ - Audio Aesthetics
+ - MuQ
+ - FAD
+ pipeline_tag: text-to-audio
+ library_name: diffusers
+ tags:
+ - music
+ - art
+ ---
  # JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
 
  JAM is a rectified flow-based model for lyrics-to-song generation that addresses the lack of fine-grained word-level controllability in existing lyrics-to-song models. Built on a compact 530M-parameter architecture with 16 LLaMA-style Transformer layers as the Diffusion Transformer (DiT) backbone, JAM enables precise vocal control that musicians desire in their workflows. Unlike previous models, JAM provides word and phoneme-level timing control, allowing musicians to specify the exact placement of each vocal sound for improved rhythmic flexibility and expressive timing.
@@ -267,4 +283,4 @@ For questions, concerns, or collaboration inquiries, please contact the Project
  For issues and questions:
  - Open an issue on GitHub
  - Check the troubleshooting section above
- - Review the configuration options for parameter tuning
+ - Review the configuration options for parameter tuning