OpenGVLab/InternViT-6B-448px-V1-0


czczup committed on Feb 11, 2024

Commit 08c3036 (verified) · 1 parent: 6002d2b

Update README.md

Files changed (1)

README.md +3 -3


@@ -13,7 +13,7 @@ datasets:
 ## What is InternVL?
-\[[Paper](https://arxiv.org/abs/2312.14238)\]  \[[GitHub](https://github.com/OpenGVLab/InternVL)\]
+\[[Paper](https://arxiv.org/abs/2312.14238)\]  \[[GitHub](https://github.com/OpenGVLab/InternVL)\]   \[[Chat Demo](https://internvl.opengvlab.com/)\]
 InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
@@ -22,7 +22,7 @@ It is _**the largest open-source vision/vision-language foundation model (14B)**
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64119264f0f81eb569e0d569/k5UATwX5W2b5KJBN5C58x.png)
 ## Model Details
-- **Model Type:** feature backbone
+- **Model Type:** vision foundation model, feature backbone
 - **Model Stats:**
   - Params (M): 5903
   - Image size: 448 x 448
@@ -53,7 +53,7 @@ outputs = model(pixel_values)
 ## Citation
-If you find this project useful in your research, please consider cite:
+If you find this project useful in your research, please consider citing:
 ```BibTeX
 @article{chen2023internvl,