guru-7B / README.md
---
library_name: transformers
pipeline_tag: text-generation
license: cc-by-nc-4.0
---

This repository contains the Guru-32B model (base: Qwen2.5-32B), presented in the paper "Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective".

The reported scores are evaluated with sampling at temperature=1.0 and top_p=0.7.
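For readers who want to sample from the model with the same settings, the following is a minimal sketch using the `transformers` library named in the metadata. Only `temperature=1.0` and `top_p=0.7` come from this README. The repository id `your-org/guru-7B` is a hypothetical placeholder, and `max_new_tokens`, `torch_dtype="auto"`, and `device_map="auto"` (which requires the `accelerate` package) are assumptions. This is not the authors' evaluation harness.

```python
# Sampling settings quoted in this README; everything else below is an assumption.
SAMPLING_KWARGS = {
    "do_sample": True,  # sampling must be enabled for temperature/top_p to take effect
    "temperature": 1.0,
    "top_p": 0.7,
}

# Hypothetical placeholder: replace with the actual Hub repository id.
MODEL_ID = "your-org/guru-7B"


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 512) -> str:
    """Load the model and sample one completion with the settings above."""
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, **SAMPLING_KWARGS
    )
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("What is 17 * 23?")` would download the weights and return one sampled completion. Because sampling is stochastic, repeated calls may give different answers, so a single run should not be expected to reproduce the reported scores exactly.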

Leaderboard

Please refer to the paper for more details.