OpenGVLab/PVC-InternVL2-8B

Image-Text-to-Text

feature-extraction

token compression

cyyang822 committed on Dec 13, 2024

Commit 3b68028 · 1 Parent(s): f764292

update README.md

Files changed (1)

README.md +15 -0

@@ -16,7 +16,9 @@ tags:
 # PVC-InternVL2-8B
+[\[📜 Paper\]](https://arxiv.org/abs/2412.09613)
 [\[📂 GitHub\]](https://github.com/OpenGVLab/PVC)
+[\[🚀 Quick Start\]](#quick-start)
 ## Introduction
@@ -227,6 +229,19 @@ response = model.chat(tokenizer, pixel_values, question, generation_config, data
 print(f'User: {question}\nAssistant: {response}')
 ```
+## 🖊️ Citation
+If you find this work helpful in your research, please consider citing:
+```bibtex
+@article{yang2024pvc,
+  title={PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models},
+  author={Yang, Chenyu and Dong, Xuan and Zhu, Xizhou and Su, Weijie and Wang, Jiahao and Tian, Hao and Chen, Zhe and Wang, Wenhai and Lu, Lewei and Dai, Jifeng},
+  journal={arXiv preprint arXiv:2412.09613},
+  year={2024}
+}
+```
 ## License
 This project is released under the MIT license. Parts of this project contain code and models from other sources, which are subject to their respective licenses.