The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5 • 46
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 129
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training Paper • 2410.19313 • Published Oct 25, 2024 • 19 • 5
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training Paper • 2410.19313 • Published Oct 25, 2024 • 19 • 5
On Memorization of Large Language Models in Logical Reasoning Paper • 2410.23123 • Published Oct 30, 2024 • 18
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Jul 21 • 632
Xwin-LM: Strong and Scalable Alignment Practice for LLMs Paper • 2405.20335 • Published May 30, 2024 • 18
Common 7B Language Models Already Possess Strong Math Capabilities Paper • 2403.04706 • Published Mar 7, 2024 • 21