Update README.md
README.md CHANGED
@@ -28,7 +28,9 @@ tags:
 <img src="Vintern_logo.png" width="700"/>
 </div>
 
-
+[\[🤗 HF Demo\]](https://huggingface.co/spaces/khang119966/Vintern-v2)
+
+## Vintern-1B-v2 ❄️ (Viet-InternVL2-1B-v2) - The LLaVA 🌋 Challenger
 
 We are excited to introduce **Vintern-1B-v2**, a Vietnamese 🇻🇳 multimodal model that combines the advanced Vietnamese language model [Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct)[1] with the latest visual model, [InternViT-300M-448px](https://huggingface.co/OpenGVLab/InternViT-300M-448px)[2] (CVPR 2024). The model excels at tasks such as OCR-VQA, Doc-VQA, and Chart-VQA. With only 1 billion parameters and a **4096-token context length**, it is finetuned from the Viet-InternVL-1B model on over 3 million specialized image-question-answer pairs for optical character recognition 🔍, text recognition 🔤, document extraction 📑, and general QA. The model can be integrated into various on-device applications 📱, demonstrating its versatility and robust capabilities.
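To make the 4096-token context length in the paragraph above more concrete, the following is a rough sketch of how many image tiles could fit alongside a text prompt. It assumes an InternVL2-style visual token layout (448x448 tiles, 14-pixel patches, and a 0.5 pixel-shuffle downsampling ratio, which gives 256 visual tokens per tile). Only the 4096 context length and the 448px tile size come from the README; the other constants and the helper names `tokens_per_tile` and `max_tiles` are illustrative assumptions, not part of the model's API.

```python
# Rough image-token budget for a 4096-token context.
# CONTEXT_LENGTH and TILE_SIZE come from the README text;
# PATCH_SIZE and DOWNSAMPLE_RATIO are assumptions (InternVL2-style defaults).
CONTEXT_LENGTH = 4096
TILE_SIZE = 448
PATCH_SIZE = 14          # assumption: ViT patch size
DOWNSAMPLE_RATIO = 0.5   # assumption: pixel-shuffle ratio per side


def tokens_per_tile(tile_size: int = TILE_SIZE,
                    patch_size: int = PATCH_SIZE,
                    ratio: float = DOWNSAMPLE_RATIO) -> int:
    """Visual tokens produced by one square tile after downsampling."""
    patches_per_side = tile_size // patch_size
    return int((patches_per_side * ratio) ** 2)


def max_tiles(context_length: int = CONTEXT_LENGTH,
              text_tokens: int = 512) -> int:
    """Largest number of tiles that fits next to `text_tokens` prompt tokens."""
    remaining = context_length - text_tokens
    return max(0, remaining // tokens_per_tile())


if __name__ == "__main__":
    print("tokens per tile:", tokens_per_tile())
    print("tiles with a 512-token prompt:", max_tiles(text_tokens=512))
```

Under these assumptions, a longer prompt or answer leaves room for fewer tiles, so dense documents may need a smaller tile cap to stay within the context window.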