license: apache-2.0
datasets:
  - Code2Logic/GameQA-140K
base_model:
  - llava-hf/llava-onevision-qwen2-7b-ov-hf

Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning

To the best of our knowledge, this is the first work that leverages game code to synthesize multimodal reasoning data for training VLMs. Furthermore, when trained solely with GRPO on GameQA (the dataset synthesized with our proposed Code2Logic approach), multiple cutting-edge open-source models show significantly improved out-of-domain generalization.

[📖 Paper] [🤗 GameQA-140K Dataset] [🤗 GameQA-InternVL3-8B] [🤗 GameQA-Qwen2.5-VL-7B] [🤗 GameQA-LLaVA-OV-7B]

News

  • We've open-sourced on Hugging Face the three models trained with GRPO on GameQA.