view article Article Scaling Test-Time Compute to Achieve Gold Medal at IOI 2025 with Open-Weight Models Oct 20, 2025 • 20
view article Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 176
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 75
view article Article nanoVLM: The simplest repository to train your VLM in pure PyTorch +5 May 21, 2025 • 247
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4, 2025 • 252
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30, 2024 • 81
view article Article Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 Jun 20, 2024 • 26