WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks Paper • 2601.02439 • Published 5 days ago • 15
VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation Paper • 2510.23497 • Published Oct 27, 2025 • 1
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes Paper • 2306.13649 • Published Jun 23, 2023 • 29
Unlocking On-Policy Distillation for Any Model Family: Apply on-policy distillation to any model family Space • Running • 76
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26, 2025 • 59