llm-blender/PairRM
Text Generation • Updated • 382 • 205
A curated collection of reward models for use with techniques such as rejection sampling and RLHF/RLAIF.
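As a rough illustration of how a reward model plugs into rejection sampling, the sketch below does best-of-n selection: score every candidate response and keep the highest-scoring one. The names `best_of_n` and `toy_reward` are hypothetical and not taken from any model in this collection. `toy_reward` is a trivial word-overlap stand-in, so in practice it would be swapped for a call to a real reward model.

```python
from typing import Callable, Sequence


def best_of_n(
    prompt: str,
    candidates: Sequence[str],
    reward_fn: Callable[[str, str], float],
) -> str:
    """Return the candidate that reward_fn scores highest for the prompt."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda c: reward_fn(prompt, c))


def toy_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a reward model.

    Counts how many whitespace-separated response tokens also appear
    in the prompt. A real reward model would replace this function.
    """
    prompt_words = set(prompt.lower().split())
    return float(sum(w in prompt_words for w in response.lower().split()))


prompt = "name a red fruit"
candidates = ["I do not know.", "A red fruit is the cherry.", "Blue."]

# Keep only the response the (toy) reward function prefers.
best = best_of_n(prompt, candidates, toy_reward)
print(best)
```

The reward function is passed in as a parameter so the selection logic stays the same whichever scorer is used.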