Unpaired preference datasets - a trl-lib Collection

Updated Jan 8, 2025

trl-lib/ultrafeedback-gpt-3.5-turbo-helpfulness

Updated Jan 8, 2025 • 16.6k rows • 66 downloads • 4 likes

trl-lib/kto-mix-14k

Updated Mar 25, 2024 • 15k rows • 478 downloads • 9 likes