A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning
Paper • 2507.08267