bhimrazy (Bhimraj Yadav)

upvoted an article 2 months ago

Supercharge your OCR Pipelines with Open Models

Article • Oct 21, 2025 • 291

upvoted 9 papers 3 months ago

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6, 2025 • 127

Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR

Paper • 2509.18174 • Published Sep 17, 2025 • 128

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24, 2025 • 42

Docling Technical Report

Paper • 2408.09869 • Published Aug 19, 2024 • 2

Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion

Paper • 2501.17887 • Published Jan 27, 2025 • 1

Qwen3-Omni Technical Report

Paper • 2509.17765 • Published Sep 22, 2025 • 143

upvoted 4 papers 4 months ago

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 72

MobileCLIP2: Improving Multi-Modal Reinforced Training

Paper • 2508.20691 • Published Aug 28, 2025 • 5

VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 139

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28, 2025 • 63

upvoted an article 8 months ago

Vision Language Models (Better, faster, stronger)

Article • May 12, 2025 • 580

upvoted 2 papers 9 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 202

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 168

upvoted a paper 10 months ago

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 170

upvoted 2 articles 10 months ago

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

Article • Mar 4, 2025 • 78

The Beginners Guide to Cleaning a Dataset

Article • Nov 18, 2024 • 24

Bhimraj Yadav

AI & ML interests

Organizations

Agent Learning via Early Experience

TTRV: Test-Time Reinforcement Learning for Vision Language Models

Less is More: Recursive Reasoning with Tiny Networks
