Preference Optimization for Reasoning with Pseudo Feedback
Paper • 2411.16345
Resources for the paper Preference Optimization for Reasoning with Pseudo Feedback (ICLR 2025)