Distilling 100B+ Models 40x Faster with TRL 📝 TRL distillation for 100B+ teachers, 40x faster • Featured • 60
Article Multimodal Embedding & Reranker Models with Sentence Transformers 8 days ago • 43
Article AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality Jan 21 • 33
Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 Mar 10 • 126
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 Explore synthetic data experiments on a virtual bookshelf • 220
The Smol Training Playbook 📚 The secrets to building world-class LLMs • Featured • 3.11k
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems 📝 Who needs 1T parameters? Olympiad proofs with a 4B model • Featured • 71
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16, 2025 • 124
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA +3 May 24, 2023 • 176