amd
/

Commit f9ad64d by hecui102 · verified · 1 parent: 3be4669

Update README.md

Files changed (1)
  1. README.md +21 -15
@@ -4,6 +4,7 @@ datasets:
  - nkp37/OpenVid-1M
  - TempoFunk/webvid-10M
  ---
+ # AMD Hummingbird image-to-video Model
  ⚡️ In this work, we present **AMD Hummingbird-I2V**, a compact and efficient **diffusion-based** I2V model designed for high-quality video synthesis under limited
  computational budgets.Hummingbird-I2V adopts a lightweight **U-Net** architecture with **0.9B parameters** and a novel two-stage training strategy guided by
  **reward-based feedback**, resulting in substantial improvements in inference speed, model efficiency, and visual quality. To further improve output resolution with minimal
@@ -13,7 +14,6 @@ inference steps on an AMD Radeon™ RX 7900 XTX GPU. Quantitative results on th
  U-Net-based diffusion models and competitive results compared to significantly larger DiT-based models. We provide a detailed analysis of the model architecture, training
  methodology, and benchmark performance.
 
- <img src="src/key_takeway.png" alt="key_takeway" title="key_takeway" class="key_takeway">
 
  <img src="src/i2v_training_pipeline.png" alt="i2v_training_pipeline" title="i2v_training_pipeline" class="i2v_training_pipeline">
 
@@ -50,22 +50,28 @@ methodology, and benchmark performance.
  </style>
 
 
+ <table>
+ <tr>
+ <td><img src="src/01.gif"></td>
+ <td><img src="src/02.gif"></td>
+ <td><img src="src/03.gif"></td>
+ <td><img src="src/04.gif"></td>
+ </tr>
+ <tr>
+ <td><img src="src/05.gif"></td>
+ <td><img src="src/06.gif"></td>
+ <td><img src="src/07.gif"></td>
+ <td><img src="src/08.gif"></td>
+ </tr>
+ <tr>
+ <td><img src="src/09.gif"></td>
+ <td><img src="src/10.gif"></td>
+ <td><img src="src/11.gif"></td>
+ <td><img src="src/12.gif"></td>
+ </tr>
+ </table>
 
 
 
 
 
-
- | Model | I2V Subj | I2V Bkg | Cam Mot | Subj Cons | Bkg Cons | Mot Smo | Dyn Deg | Aes Qual | Img Qual | Total Score |
- |---------------------|----------|---------|---------|-----------|-----------|----------|----------|-----------|-----------|--------------|
- | CogVideoSFT | 97.67% | 98.76% | 84.93% | 95.47% | 98.30% | 98.35% | 36.51% | 59.76% | 67.64% | 87.98% |
- | CogVideoX-12V-5B | 98.87% | 99.08% | 76.25% | 96.99% | 99.02% | 98.85% | 21.79% | 60.76% | 69.53% | 88.21% |
- | Step-Video-T12V | 97.44% | 98.45% | 48.15% | 95.62% | 96.92% | 99.08% | 48.78% | 61.74% | 70.17% | 87.98% |
- | HunYuan | - | - | - | - | 93.85% | 99.39% | - | - | - | - |
- | Wan-2.1-14B | - | - | - | - | 98.46% | 96.07% | - | - | - | - |
- | Animate-Anything | 98.76% | 98.58% | 13.08% | 98.90% | 98.19% | 98.61% | 2.68% | 67.12% | 72.09% | 86.48% |
- | SEINE-512 | 97.15% | 96.94% | 20.97% | 95.28% | 97.12% | 97.12% | 27.07% | 64.55% | 71.39% | 85.52% |
- | I2VGen-XL | 96.48% | 96.83% | 18.46% | 95.45% | 96.42% | 98.03% | 24.08% | 64.82% | 69.14% | 85.28% |
- | ConsistI2V | 95.82% | 95.95% | 33.92% | 95.27% | 94.38% | 97.38% | 18.62% | 59.00% | 66.92% | 84.91% |
- | DynamiCrafter-512 | 97.05% | 97.56% | 20.92% | 94.74% | 98.29% | 97.83% | 40.57% | 58.71% | 62.28% | 85.25% |
- | Hummingbird-I2V | 96.30% | 96.39% | 12.69% | 97.10% | 98.60% | 98.24% | 62.60% | 64.45% | 69.27% | 87.05% |