@@ -13,9 +13,15 @@ tags:
 # ZipVoice⚡: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching</center>
-This model is a checkpoint for **ZipVoice-Dialog**, a non-autoregressive zero-shot spoken dialogue generation model, as presented in [ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching](https://huggingface.co/papers/2507.09318).
-You can also find the project/demo page here: [https://zipvoice-dialog.github.io](https://zipvoice-dialog.github.io)
 ## 1. Explanation of each directory
@@ -29,12 +35,8 @@ You can also find the project/demo page here: [https://zipvoice-dialog.github.io
 | zipvoice_dialog_opendialog     | ZipVoice-Dialog           | OpenDialog                        | zipvoice/model.pt          |
 | zipvoice_dialog_stereo         | ZipVoice-Dialog-Stereo    | in-house dataset                  | zipvoice_dialog/model.pt   |
-## 2. Github
-See our Github repository [ZipVoice](https://github.com/k2-fsa/ZipVoice) for details
-## 3. Discussion & Communication
 You can directly discuss on [Github Issues](https://github.com/k2-fsa/ZipVoice/issues).
@@ -44,7 +46,7 @@ You can also scan the QR code to join our wechat group or follow our wechat offi
 | ------------ | ----------------------- |
 |![wechat](https://k2-fsa.org/zh-CN/assets/pic/wechat_group.jpg) |![wechat](https://k2-fsa.org/zh-CN/assets/pic/wechat_account.jpg) |
-## 4. Citation
 ```bibtex
 @article{zhu2025zipvoice,

 # ZipVoice⚡: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching</center>
+This model consists of checkpoints for two fast and high-quality non-autoregressive zero-shot text-to-speech models:
+- **ZipVoice**, for single-speaker speech generation. Details in [paper](https://arxiv.org/abs/2506.13053) and [demo](https://zipvoice.github.io).
+- **ZipVoice-Dialog**, for spoken dialogue generation. Details in [paper](https://arxiv.org/abs/2507.09318) and [demo](https://zipvoice-dialog.github.io).
+See our Github repository [ZipVoice](https://github.com/k2-fsa/ZipVoice) for instructions on using our models.
 ## 1. Explanation of each directory
 | zipvoice_dialog_opendialog     | ZipVoice-Dialog           | OpenDialog                        | zipvoice/model.pt          |
 | zipvoice_dialog_stereo         | ZipVoice-Dialog-Stereo    | in-house dataset                  | zipvoice_dialog/model.pt   |
+## 2. Discussion & Communication
 You can directly discuss on [Github Issues](https://github.com/k2-fsa/ZipVoice/issues).
 | ------------ | ----------------------- |
 |![wechat](https://k2-fsa.org/zh-CN/assets/pic/wechat_group.jpg) |![wechat](https://k2-fsa.org/zh-CN/assets/pic/wechat_account.jpg) |
+## 3. Citation
 ```bibtex
 @article{zhu2025zipvoice,