-
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5
Text Generation • 50B • Updated • 20.2k • 181 -
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8
Text Generation • 50B • Updated • 2.51k • 15 -
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1
Text Generation • 253B • Updated • 6.61k • 328 -
nvidia/Llama-3_3-Nemotron-Super-49B-v1
Text Generation • 50B • Updated • 22.1k • 320
Collections
Collections including paper arxiv:2505.00949
-
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper • 2501.12599 • Published • 123 -
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Paper • 2503.14476 • Published • 137 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 51 -
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
Paper • 2503.24290 • Published • 63
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 31 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 123 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 5
-
The Leaderboard Illusion
Paper • 2504.20879 • Published • 70 -
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
Paper • 2505.09343 • Published • 68 -
LLMs for Engineering: Teaching Models to Design High Powered Rockets
Paper • 2504.19394 • Published • 14 -
Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions
Paper • 2504.19056 • Published • 18
-
Human-like Episodic Memory for Infinite Context LLMs
Paper • 2407.09450 • Published • 63 -
MUSCLE: A Model Update Strategy for Compatible LLM Evolution
Paper • 2407.09435 • Published • 23 -
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Paper • 2407.09121 • Published • 6 -
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Paper • 2407.14482 • Published • 27