OpenGVLab/InternVL

czczup committed on Dec 26, 2023

Commit d980280
Parent: 61d9457

Update README.md

Files changed (1)

README.md +3 -6

@@ -17,22 +17,19 @@ InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
 It is _**the largest open-source vision/vision-language foundation model (14B)**_ to date, achieving _**32 state-of-the-art**_ performances on a wide range of tasks such as visual perception, cross-modal retrieval, multimodal dialogue, etc.
-# Model Zoo
-## Pretrained Weights
+# Pretrained Weights
 | model name              | type    | download                                                                                       |  size   |
 | ----------------------- | ------- | ---------------------------------------------------------------------------------------------- | :-----: |
 | InternViT-6B-224px      | pytorch | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL/blob/main/intern_vit_6b_224px.pth)      |  12 GB  |
-## Linear-Probe Image Classification
+# Linear-Probe Image Classification
 | model name         | IN-1K | IN-ReaL | IN-V2 | IN-A | IN-R | IN-Sketch |                                                                                                         download                                                                                                  |
 | ------------------ | :---: | :-----: | :---: | :--: | :--: | :-------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
 | InternViT-6B-224px | 88.2  |  90.4   | 79.9  | 77.5 | 89.8 |   69.1    | [ckpt](https://huggingface.co/OpenGVLab/InternVL/resolve/main/intern_vit_6b_224px_head.pth) \| [log](https://github.com/OpenGVLab/InternVL/blob/main/classification/work_dirs/intern_vit_6b_1k_224/log_rank0.txt) |
-## Semantic Segmentation
+# Semantic Segmentation
 | type            | backbone              |  head   | mIoU |                                                   config                                                   |                                                                                                                      download                                                                                                                       |
 | --------------- | --------------------- | :-----: | :--: | :--------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |