Preference Optimization for Reasoning with Pseudo Feedback
Paper: 2411.16345
Resources for the paper Preference Optimization for Reasoning with Pseudo Feedback (ICLR 2025)