EgoCS-400K: An Egocentric Gameplay Dataset for World Models Paper • 2606.18180 • Published 18 days ago • 15
Embodied-R1.5: Evolving Physical Intelligence via Embodied Foundation Models Paper • 2606.11324 • Published 25 days ago • 170
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models Paper • 2606.03988 • Published about 1 month ago • 126
OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning Paper • 2605.28691 • Published May 27 • 24
REPOT: Recoverable Program-of-Thought via Checkpoint Repair Paper • 2605.30052 • Published May 28 • 10
OpenComputer: Verifiable Software Worlds for Computer-Use Agents Paper • 2605.19769 • Published May 19 • 85
Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation Paper • 2605.12975 • Published May 13 • 9
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published May 13 • 274
Mean Mode Screaming: Mean-Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published May 7 • 238
River-LLM: Large Language Model Seamless Exit Based on KV Share Paper • 2604.18396 • Published Apr 20 • 6
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published Apr 6 • 236
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 266