Models (593)

Active filters: visual-question-answering

microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • 6B • Updated May 1 • 432k • 1.48k

YannQi/R-4B

Visual Question Answering • 5B • Updated 1 day ago • 11.2k • 16

Salesforce/blip2-opt-2.7b

Image-Text-to-Text • 4B • Updated Feb 3 • 656k • 409

Salesforce/blip-vqa-base

Visual Question Answering • 0.4B • Updated Feb 3 • 410k • 168

Salesforce/blip2-flan-t5-xl

Image-Text-to-Text • 4B • Updated Feb 3 • 117k • 85

google/matcha-plotqa-v2

Visual Question Answering • Updated Jul 22, 2023 • 111 • 12

paragon-AI/blip2-image-to-text

Image-to-Text • Updated Jun 24, 2023 • 1.22k • 29

openbmb/MiniCPM-Llama3-V-2_5-int4

Visual Question Answering • 5B • Updated Feb 27 • 11.1k • 76

internlm/internlm-xcomposer2d5-7b

Visual Question Answering • Updated Jul 22, 2024 • 1.24M • 208

erax-ai/EraX-VL-2B-V1.5

Visual Question Answering • 2B • Updated Jan 15 • 2.29k • 10

google/cxr-foundation

Image Classification • Updated Feb 20 • 252 • 84

DAMO-NLP-SG/VideoLLaMA3-7B

Visual Question Answering • 8B • Updated Mar 20 • 100k • 65

DAMO-NLP-SG/VideoLLaMA3-2B-Image

Visual Question Answering • 2B • Updated Mar 20 • 176 • 8

yuxianglai117/Med-R1

Visual Question Answering • Updated Jul 7 • 8

OneEyeDJ/Emotionally-Aware_AI_Companion

Visual Question Answering • 8B • Updated 3 days ago • 9 • 1

dandelin/vilt-b32-finetuned-vqa

Visual Question Answering • Updated Aug 2, 2022 • 112k • 414

azwierzc/vilt-b32-finetuned-vqa-pl

Visual Question Answering • Updated Mar 21, 2022 • 6

Bingsu/temp_vilt_vqa

Visual Question Answering • Updated Nov 28, 2022 • 1

microsoft/git-base-vqav2

Visual Question Answering • 0.2B • Updated Mar 9, 2024 • 226 • 19

microsoft/git-base-textvqa

Visual Question Answering • 0.2B • Updated Mar 29, 2024 • 495 • 6

Salesforce/blip-vqa-capfilt-large

Visual Question Answering • Updated Feb 3 • 83.3k • 52

tufa15nik/vilt-finetuned-vqasi

Visual Question Answering • Updated Dec 15, 2022 • 20

microsoft/git-large-vqav2

Visual Question Answering • 0.4B • Updated Sep 7, 2023 • 733 • 18

microsoft/git-large-textvqa

Visual Question Answering • 0.4B • Updated Apr 9, 2024 • 222 • 4

ivelin/donut-refexp-combined-v1

Visual Question Answering • Updated Feb 7, 2023 • 196 • 4

tifa-benchmark/promptcap-coco-vqa

Image-to-Text • Updated Dec 11, 2023 • 35 • 12

sheldonxxxx/OFA_model_weights

Visual Question Answering • Updated Feb 8, 2023 • 1

Salesforce/blip2-opt-6.7b

Image-Text-to-Text • 8B • Updated Feb 3 • 4.24k • 78

Salesforce/blip2-opt-2.7b-coco

Image-to-Text • 4B • Updated Feb 3 • 633k • 9

Salesforce/blip2-opt-6.7b-coco

Image-Text-to-Text • 8B • Updated Feb 3 • 88.9k • 34