Representing Speech Through Autoregressive Prediction of Cochlear Tokens Paper • 2508.11598 • Published 9 days ago • 16
S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models Paper • 2508.12880 • Published 6 days ago • 42
Inverse-LLaVA: Eliminating Alignment Pre-training Through Text-to-Vision Mapping Paper • 2508.12466 • Published 7 days ago • 8
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model Paper • 2508.13009 • Published 6 days ago • 22
ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning Paper • 2508.10419 • Published 11 days ago • 69
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published 11 days ago • 46
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 128