Update README.md
README.md CHANGED
@@ -130,7 +130,7 @@ base_model:

## Model Summary

-The LLaVA-NeXT-Video models are 7/72B parameter models trained on [LLaVA-
+The LLaVA-NeXT-Video models are 7/72B parameter models trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data) and the [LLaVA-OneVision Dataset](https://huggingface.co/datasets/lmms-lab/LLaVA-OneVision-Data), based on the Qwen2 language model with a context window of 32K tokens.

This model supports at most 64 frames.

@@ -143,7 +143,7 @@ This model supports at most 64 frames.

### Intended use

-The model was trained on [LLaVA-
+The model was trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data) and the [LLaVA-OneVision Dataset](https://huggingface.co/datasets/lmms-lab/LLaVA-OneVision-Data), and can interact with images, multi-image inputs, and videos, with a particular focus on videos.
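Since the updated summary notes that the model accepts at most 64 frames per video, a minimal sketch of uniform frame sampling may help readers preprocess longer clips. The helper name and the NumPy `linspace`-based sampling are assumptions for illustration only, not the model card's official preprocessing pipeline.

```python
import numpy as np

def sample_frame_indices(total_frames: int, max_frames: int = 64) -> np.ndarray:
    """Uniformly sample at most `max_frames` frame indices from a clip.

    Hypothetical helper illustrating the 64-frame cap described in the
    model summary; the exact sampling strategy used upstream may differ.
    """
    num_frames = min(total_frames, max_frames)
    # Evenly spaced indices across the whole clip, rounded to integers.
    return np.linspace(0, total_frames - 1, num_frames).round().astype(int)

print(sample_frame_indices(1000))  # 64 indices spread over a 1000-frame video
print(sample_frame_indices(32))    # short clip: all 32 frames are kept
```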