Update README.md
        README.md
    CHANGED
    
@@ -63,7 +63,7 @@ Since we utilizes a pre-trained Multimodal Large Language Model (MLLM) with a De
 
 The overall architecture of our system is designed to maximize the synergy between image and text modalities, ensuring a robust and coherent generation of video content from static images. This integration not only improves the fidelity of the generated videos but also enhances the model's ability to interpret and utilize complex multimodal inputs. The overall architecture is as follows.
 <p align="center">
-  <img src="https://raw.githubusercontent.com/Tencent/HunyuanVideo-I2V/refs/heads/main/assets/backbone.png"  style="max-width: 
+  <img src="https://raw.githubusercontent.com/Tencent/HunyuanVideo-I2V/refs/heads/main/assets/backbone.png"  style="max-width: 45%; height: auto;">
 </p>
 
 
@@ -216,7 +216,7 @@ Prompt description: The trigger word is written directly in the video caption. I
 
 For example, AI hair growth effect (trigger): rapid_hair_growth, The hair of the characters in the video is growing rapidly. + original prompt
 
-After having the training video and prompt pair, refer to [here]
+After having the training video and prompt pair, refer to [here](https://github.com/Tencent/HunyuanVideo-I2V/blob/main/hyvideo/hyvae_extract/README.md) for training data construction.
 
 
 ### Training
