MattBou00/smolLM-360m-detox_try_3_stable_retry-ckpt-ep20-2025-08-18_18-34-45 • Reinforcement Learning • 0.4B • Updated Aug 18 • 4
MattBou00/smolLM-360m-detox_try_3_stable_retry-ckpt-ep40-2025-08-18_18-34-45 • Reinforcement Learning • 0.4B • Updated Aug 18 • 4
MattBou00/smolLM-360m-detox_try_4_closekl-ckpt-ep20-2025-08-18_18-50-03 • Reinforcement Learning • 0.4B • Updated Aug 18 • 4
MattBou00/smolLM-360m-detox_try_4_closekl-ckpt-ep40-2025-08-18_18-50-03 • Reinforcement Learning • 0.4B • Updated Aug 18 • 4
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated Aug 19 • 4
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated Aug 19 • 3
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated Aug 19 • 3
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated Aug 19 • 4
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated Aug 19 • 4
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated Aug 19 • 3