Collections

Collections including paper arxiv:2508.14704

about 1 hour ago

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 17
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 29
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 32
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery

Paper • 2508.14111 • Published 7 days ago • 25
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published 5 days ago • 30

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

Paper • 2508.10751 • Published 10 days ago • 25
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 256
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published 5 days ago • 30

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published Jun 5 • 130
Magistral

Paper • 2506.10910 • Published Jun 12 • 63
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs

Paper • 2506.07240 • Published Jun 8 • 6
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Paper • 2506.09991 • Published Jun 11 • 56

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval

Paper • 2310.15511 • Published Oct 24, 2023 • 5
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Paper • 2310.14566 • Published Oct 23, 2023 • 27
SmartPlay : A Benchmark for LLMs as Intelligent Agents

Paper • 2310.01557 • Published Oct 2, 2023 • 13
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation

Paper • 2310.03214 • Published Oct 5, 2023 • 20

MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published 5 days ago • 30

about 14 hours ago

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published 12 days ago • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published 11 days ago • 16
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published 19 days ago • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published 5 days ago • 39

about 23 hours ago

A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published 14 days ago • 84
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published 19 days ago • 61
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published 17 days ago • 117
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published 18 days ago • 105

Fusion-Eval: Integrating Evaluators with LLMs

Paper • 2311.09204 • Published Nov 15, 2023 • 6
Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer

Paper • 2311.06720 • Published Nov 12, 2023 • 9
Safurai 001: New Qualitative Approach for Code LLM Evaluation

Paper • 2309.11385 • Published Sep 20, 2023 • 2
Assessment of Pre-Trained Models Across Languages and Grammars

Paper • 2309.11165 • Published Sep 20, 2023 • 1
