---
license: apache-2.0
language:
- en
base_model:
- Wan-AI/Wan2.1-T2V-1.3B
pipeline_tag: text-to-video
---

<p align="center">
<img src="assets/icon.png" height=25>
</p>

<h1 align='center'>EchoShot: Multi-Shot Portrait Video Generation</h1>

<p align="center">
<strong><a href="https://scholar.google.com/citations?hl=en&user=zQnTBEoAAAAJ">Jiahao Wang</a><sup>1</sup></strong>
·
<strong><a href="https://scholar.google.com/citations?user=73JaDUQAAAAJ&hl=en&oi=ao">Hualian Sheng</a><sup>2</sup></strong>
·
<strong><a href="https://scholar.google.com/citations?user=LMVeRVAAAAAJ&hl=en&oi=ao">Sijia Cai</a><sup>2,†</sup></strong>
·
<strong><a href="https://gr.xjtu.edu.cn/web/zhangwzh123/">Weizhan Zhang</a><sup>1,*</sup></strong><br>
<strong><a href="https://gr.xjtu.edu.cn/web/yancaixia">Caixia Yan</a><sup>1</sup></strong>
·
<strong><a href="">Yachuang Feng</a><sup>2</sup></strong>
·
<strong><a href="https://scholar.google.com/citations?user=VQp_ye4AAAAJ&hl=zh-CN&oi=ao">Bing Deng</a><sup>2</sup></strong>
·
<strong><a href="https://scholar.google.com/citations?user=T9AzhwcAAAAJ&hl=zh-CN&oi=ao">Jieping Ye</a><sup>2</sup></strong>
<br>
<br>
<sup>1</sup>Xi'an Jiaotong University
<sup>2</sup>Alibaba Cloud
<br>
<br>
<a href="https://arxiv.org/abs/2506.15838"><img src='https://img.shields.io/badge/+-arXiv-red' alt='Paper PDF'></a>
<a href="https://johnneywang.github.io/EchoShot-webpage/"><img src='https://img.shields.io/badge/+-Project_Page-blue' alt='Project Page'></a>
<a href="https://github.com/JoHnneyWang/EchoShot"><img src='https://img.shields.io/badge/+-Github_Page-green' alt='Github Page'></a>
<br>
</p>

## 📝 Intro

This is the official model release for EchoShot, which lets users generate **multiple video shots showing the same person, controlled by customized prompts**. It currently supports text-to-multishot portrait video generation. We hope you have fun with this demo!

<div align="center">
<img src="assets/teasor.jpg" width="1200">
</div>

## 🔔 News

- July 15, 2025: 🔥 EchoShot-1.3B-preview is now available on [HuggingFace](https://huggingface.co/JonneyWang/EchoShot)!
- July 15, 2025: 🎉 Released the inference and training code.
- May 25, 2025: We propose [EchoShot](https://johnneywang.github.io/EchoShot-webpage/), a multi-shot portrait video generation model.

## 📖 Citation

If you are inspired by our work, please cite our paper.

```bibtex
@article{wang2025echoshot,
  title={EchoShot: Multi-Shot Portrait Video Generation},
  author={Wang, Jiahao and Sheng, Hualian and Cai, Sijia and Zhang, Weizhan and Yan, Caixia and Feng, Yachuang and Deng, Bing and Ye, Jieping},
  journal={arXiv preprint arXiv:2506.15838},
  year={2025}
}
```