metadata
license: mit
datasets:
- Kwai-Keye/Thyme-SFT
- Kwai-Keye/Thyme-RL
language:
- en
metrics:
- accuracy
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
pipeline_tag: image-text-to-text

[📖 Home Page] [📖 GitHub Repo] [📖 Technical Report]
[📊 Thyme SFT Model] [📊 Thyme RL Model] [📝 SFT Data] [📝 RL Data]
🔥 News
2025.08.15
🌟 We are excited to introduce Thyme: Think Beyond Images. Thyme goes beyond traditional "thinking with images" paradigms by autonomously generating and executing diverse image processing and computational operations as executable code, which significantly improves performance on high-resolution perception and complex reasoning tasks. Thyme is trained with a two-stage strategy that combines supervised fine-tuning with reinforcement learning using the GRPO-ATS algorithm, balancing reasoning exploration against code execution precision.
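
To make the "generate code, then execute it" idea concrete, here is a minimal, generic sketch of running a model-produced code string in a separate Python process and capturing its output. It is not Thyme's implementation (the actual pipeline is in the GitHub repo). The function name `run_generated_code`, the timeout value, and the sample `generated` string are all hypothetical, and a plain subprocess with a timeout is not a security sandbox.

```python
import subprocess
import sys


def run_generated_code(code: str, timeout_s: float = 10.0) -> dict:
    """Run a code string in a fresh Python process and capture its output.

    Hypothetical helper for illustration only; it is not part of Thyme.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode bytes to str
            timeout=timeout_s,    # stop runaway code
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


# Hypothetical model output: compute a crop box around a region of interest
# in a 4000x3000 image, clamped to the image bounds.
generated = """
width, height = 4000, 3000
cx, cy, half = 3100, 400, 256
box = (max(cx - half, 0), max(cy - half, 0),
       min(cx + half, width), min(cy + half, height))
print(box)
"""

result = run_generated_code(generated)
print(result["ok"], result["stdout"].strip())
```

In a loop of this kind, the captured `stdout` or `stderr` would typically be fed back to the model so it can continue reasoning or repair failing code. How Thyme handles that step is described in the technical report and repo, not here.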

Usage instructions, training code, and evaluation code are available in the GitHub repo.
Citation
If you find Thyme useful in your research or applications, please cite our paper:
@article{zhang2025thyme,
  title={Thyme: Think Beyond Images},
  author={Kwai Keye},
  journal={arXiv preprint},
  year={2025}
}