Mono-InternVL

Updated Jul 22

A Pioneering Monolithic MLLM

  • Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

    Paper • 2410.08202 • Published Oct 10, 2024 • 4

    Note: CVPR 2025

  • Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models

    Paper • 2507.12566 • Published Jul 16 • 14

  • OpenGVLab/Mono-InternVL-2B

    Image-Text-to-Text • 3B • Updated Jul 22 • 13.4k • 36

  • OpenGVLab/Mono-InternVL-2B-S1-1

    Image-Text-to-Text • 3B • Updated Jul 22 • 9

  • OpenGVLab/Mono-InternVL-2B-S1-2

    Image-Text-to-Text • 3B • Updated Jul 22 • 16

  • OpenGVLab/Mono-InternVL-2B-S1-3

    Image-Text-to-Text • 3B • Updated Jul 22 • 86 • 1

  • OpenGVLab/Mono-InternVL-2B-Synthetic-Data

    Viewer • Updated Jul 22 • 3.05k • 89 • 2