Update README.md
README.md
CHANGED
@@ -3,4 +3,12 @@ license: agpl-3.0
 datasets:
 - nkp37/OpenVid-1M
 - TempoFunk/webvid-10M
----
+---
+In this work, we present AMD Hummingbird-I2V, a compact and efficient diffusion-based I2V model designed for high-quality video synthesis under limited computational budgets.
+Hummingbird-I2V adopts a lightweight U-Net architecture with 0.9B parameters and a novel two-stage training strategy guided by reward-based feedback, resulting in
+substantial improvements in inference speed, model efficiency, and visual quality. To further improve output resolution with minimal overhead, we introduce a
+super-resolution module at the end of the pipeline. Additionally, we leverage ReNeg, an AMD-proposed reward-guided framework for learning negative embeddings via
+gradient descent, to further boost visual quality. As a result, Hummingbird-I2V can generate high-quality 4K video in just 11 seconds with 16 inference steps on an AMD
+Radeon™ RX 7900 XTX GPU. Quantitative results on the VBench-I2V benchmark show that Hummingbird-I2V achieves state-of-the-art performance among U-Net-based
+diffusion models and competitive results compared to significantly larger DiT-based models. We provide a detailed analysis of the model architecture, training methodology,
+and benchmark performance.