TTRV: Test-Time Reinforcement Learning for Vision Language Models Paper • 2510.06783 • Published Oct 8, 2025 • 11
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 501
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models Paper • 2510.04618 • Published Oct 6, 2025 • 127
Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR Paper • 2509.18174 • Published Sep 17, 2025 • 128
EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24, 2025 • 42
Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion Paper • 2501.17887 • Published Jan 27, 2025 • 1
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 72
MobileCLIP2: Improving Multi-Modal Reinforced Training Paper • 2508.20691 • Published Aug 28, 2025 • 5
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers Paper • 2508.20453 • Published Aug 28, 2025 • 63
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 202
view article Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality +2 Mar 4, 2025 • 78