sjkwon/sft-mdo-diverse-train-nllb-200-600M-step200 Reinforcement Learning • 0.6B • Updated Oct 15, 2024
sjkwon/6750_sft-mdo-diverse-train-nllb-200-600M Reinforcement Learning • 0.6B • Updated Oct 31, 2024 • 1
sjkwon/5e-6_6528_sft-mdo-diverse-train-nllb-200-600M Reinforcement Learning • 0.6B • Updated Nov 1, 2024
sjkwon/2e-5_2184_sft-mdo-diverse-train-nllb-200-600M Reinforcement Learning • 0.6B • Updated Nov 1, 2024
sjkwon/1e-5_2000_sft-mdo-diverse-train-nllb-200-600M Reinforcement Learning • 0.6B • Updated Nov 1, 2024