Yuzhe Gu

vanilla1116

https://guyuzhe.site/

Liqu1d-G

AI & ML interests

LLM; Reasoning; Hallucination; Self-Improvement

Recent Activity

commented on a paper 12 days ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

authored a paper 17 days ago

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

authored a paper 17 days ago

Intern-S1: A Scientific Multimodal Foundation Model

authored 4 papers 17 days ago

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Paper • 2508.03686 • Published Aug 5 • 37

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 259

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published 18 days ago • 33

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 18 days ago • 45

submitted 3 papers to Daily Papers 17 days ago

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Paper • 2512.10534 • Published 18 days ago • 31

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 18 days ago • 45

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published 18 days ago • 33

authored a paper 3 months ago

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published Oct 9 • 109

authored 2 papers 5 months ago

Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning

Paper • 2507.16814 • Published Jul 22 • 21

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

Paper • 2507.13332 • Published Jul 17 • 48

authored 2 papers 10 months ago

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

Paper • 2503.02846 • Published Mar 4 • 19

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10 • 58

authored 3 papers over 1 year ago
