MoeReward/rl_checkpoints

Safetensors
rl_checkpoints (315 GB)
  • 1 contributor
History: 7 commits
shengyi-qian: upload diffdomain1 (eaa1e52, 6 months ago)
  • qwen1.5_base_rule_base_arc_heavy_diffdomain1_reward_func
    upload diffdomain1 6 months ago
  • qwen1.5_base_rule_base_arc_heavy_grpo_naive
    three checkpoints 7 months ago
  • qwen1.5_base_rule_base_equal_dist_grpo_diff_reward_func
    upload diff rewards 7 months ago
  • qwen1.5_base_rule_base_equal_dist_grpo_naive
    three checkpoints 7 months ago
  • qwen1.5_base_rule_base_grpo_naive
    qwen1.5 rule based 7 months ago
  • qwen1.5_base_rule_base_imdb_heavy_diffdomain1_reward_func
    upload diffdomain1 6 months ago
  • qwen1.5_base_rule_base_imdb_heavy_grpo_naive
    three checkpoints 7 months ago
  • qwen1.5_base_rule_base_math_heavy_diffdomain1_reward_func
    upload diffdomain1 6 months ago
  • qwen1.5_base_rule_base_math_heavy_grpo_diff_reward_func
    upload diff rewards 7 months ago
  • qwen1.5_base_rule_base_nq_heavy_diffdomain1_reward_func
    upload diffdomain1 6 months ago
  • qwen1.5_base_rule_base_nq_heavy_grpo_naive
    nq checkpoint 7 months ago
  • .gitattributes
    1.56 kB
    qwen1.5 rule based 7 months ago