Upload folder using huggingface_hub
    	
README.md CHANGED
    
    | @@ -1,6 +1,20 @@ | |
| 1 | 
             
            ---
         | 
| 2 | 
             
            license: mit
         | 
| 3 | 
             
            pipeline_tag: image-text-to-text
         | 
| 4 | 
             
            ---
         | 
| 5 |  | 
| 6 | 
             
            # InternVL2-2B
         | 
| @@ -64,11 +78,11 @@ InternVL 2.0 is a multimodal large language model series, featuring models of va | |
| 64 |  | 
| 65 | 
             
            - For more details and evaluation reproduction, please refer to our [Evaluation Guide](https://internvl.readthedocs.io/en/latest/internvl2.0/evaluation.html).
         | 
| 66 |  | 
| 67 | 
            -
            - We simultaneously use InternVL and VLMEvalKit repositories for model evaluation. Specifically, the results reported for DocVQA, ChartQA, InfoVQA, TextVQA, MME, AI2D, MMBench, CCBench, MMVet, and SEED-Image were tested using the InternVL repository. OCRBench, RealWorldQA, HallBench, and MathVista were evaluated using the VLMEvalKit.
         | 
| 68 |  | 
| 69 | 
             
            - For MMMU, we report both the original scores (left side: evaluated using the InternVL codebase for InternVL series models, and sourced from technical reports or webpages for other models) and the VLMEvalKit scores (right side: collected from the OpenCompass leaderboard).
         | 
| 70 |  | 
| 71 | 
            -
            - Please note that evaluating the same model using different testing toolkits like InternVL and VLMEvalKit can result in slight differences, which is normal. Updates to code versions and variations in environment and hardware can also cause minor discrepancies in results.
         | 
| 72 |  | 
| 73 | 
             
            ### Video Benchmarks
         | 
| 74 |  | 
|  | |
| 1 | 
             
            ---
         | 
| 2 | 
             
            license: mit
         | 
| 3 | 
             
            pipeline_tag: image-text-to-text
         | 
| 4 | 
            +
            library_name: transformers
         | 
| 5 | 
            +
            base_model: 
         | 
| 6 | 
            +
              - OpenGVLab/InternViT-300M-448px
         | 
| 7 | 
            +
              - internlm/internlm2-chat-1_8b
         | 
| 8 | 
            +
            base_model_relation: finetune
         | 
| 9 | 
            +
            language:
         | 
| 10 | 
            +
              - multilingual
         | 
| 11 | 
            +
            tags:
         | 
| 12 | 
            +
              - internvl
         | 
| 13 | 
            +
              - vision
         | 
| 14 | 
            +
              - ocr
         | 
| 15 | 
            +
              - multi-image
         | 
| 16 | 
            +
              - video
         | 
| 17 | 
            +
              - custom_code
         | 
| 18 | 
             
            ---
         | 
| 19 |  | 
| 20 | 
             
            # InternVL2-2B
         | 
|  | |
| 78 |  | 
| 79 | 
             
            - For more details and evaluation reproduction, please refer to our [Evaluation Guide](https://internvl.readthedocs.io/en/latest/internvl2.0/evaluation.html).
         | 
| 80 |  | 
| 81 | 
            +
            - We simultaneously use [InternVL](https://github.com/OpenGVLab/InternVL) and [VLMEvalKit](https://github.com/open-compass/VLMEvalKit) repositories for model evaluation. Specifically, the results reported for DocVQA, ChartQA, InfoVQA, TextVQA, MME, AI2D, MMBench, CCBench, MMVet, and SEED-Image were tested using the InternVL repository. OCRBench, RealWorldQA, HallBench, and MathVista were evaluated using the VLMEvalKit.
         | 
| 82 |  | 
| 83 | 
             
            - For MMMU, we report both the original scores (left side: evaluated using the InternVL codebase for InternVL series models, and sourced from technical reports or webpages for other models) and the VLMEvalKit scores (right side: collected from the OpenCompass leaderboard).
         | 
| 84 |  | 
| 85 | 
            +
            - Please note that evaluating the same model using different testing toolkits like [InternVL](https://github.com/OpenGVLab/InternVL) and [VLMEvalKit](https://github.com/open-compass/VLMEvalKit) can result in slight differences, which is normal. Updates to code versions and variations in environment and hardware can also cause minor discrepancies in results.
         | 
| 86 |  | 
| 87 | 
             
            ### Video Benchmarks
         | 
| 88 |  | 
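
The metadata block added in this commit is plain YAML front matter, so it can be sanity-checked before upload. The following is a minimal sketch, not part of the commit: `parse_front_matter` and `README_HEADER` are hypothetical names, and the parser handles only the simple subset used here (`key: value` scalars and `key:` followed by `- item` lists). A real YAML library would be the safer choice for anything more complex.

```python
# Hypothetical helper: parses only the simple YAML subset used in this header.
def parse_front_matter(text: str) -> dict:
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the header
            break
        if not stripped:
            continue
        if stripped.startswith("- "):  # list item for the most recent key
            if current_key is not None and isinstance(meta[current_key], list):
                meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        current_key = key.strip()
        value = value.strip()
        # A key with no inline value starts a list.
        meta[current_key] = value if value else []
    return meta


# The header as it appears on the new side of the diff.
README_HEADER = """---
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
base_model: 
  - OpenGVLab/InternViT-300M-448px
  - internlm/internlm2-chat-1_8b
base_model_relation: finetune
language:
  - multilingual
tags:
  - internvl
  - vision
  - ocr
  - multi-image
  - video
  - custom_code
---

# InternVL2-2B
"""

meta = parse_front_matter(README_HEADER)
print(meta["library_name"])
print(meta["base_model"])
print(meta["tags"])
```

The `custom_code` tag generally signals that the repository ships its own modeling code, so loading it through `transformers` would typically require `trust_remote_code=True`.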
