- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 69
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 197
- YourBench: Easy Custom Evaluation Sets for Everyone
  Paper • 2504.01833 • Published • 22
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 242
Collections
Collections including paper arxiv:2406.17557
- OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding
  Paper • 2406.19389 • Published • 55
- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
  Paper • 2406.17557 • Published • 98
- RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
  Paper • 2407.02485 • Published • 5
- Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
  Paper • 2407.01370 • Published • 90

- SEA-LION: Southeast Asian Languages in One Network
  Paper • 2504.05747 • Published
- Do Large Language Models Speak All Languages Equally? A Comparative Study in Low-Resource Settings
  Paper • 2408.02237 • Published
- A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs
  Paper • 2406.17377 • Published
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts
  Paper • 2306.11372 • Published

- Navigating Dataset Documentations in AI: A Large-Scale Analysis of Dataset Cards on Hugging Face
  Paper • 2401.13822 • Published • 1
- Attention Is All You Need
  Paper • 1706.03762 • Published • 81
- HuggingFace's Transformers: State-of-the-art Natural Language Processing
  Paper • 1910.03771 • Published • 19
- Model Cards for Model Reporting
  Paper • 1810.03993 • Published • 5

- PDFTriage: Question Answering over Long, Structured Documents
  Paper • 2309.08872 • Published • 53
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 80
- Table-GPT: Table-tuned GPT for Diverse Table Tasks
  Paper • 2310.09263 • Published • 41
- Context-Aware Meta-Learning
  Paper • 2310.10971 • Published • 17

- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
  Paper • 2406.17557 • Published • 98
- Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
  Paper • 2406.16860 • Published • 61
- Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity
  Paper • 2406.17720 • Published • 8
- Scaling Synthetic Data Creation with 1,000,000,000 Personas
  Paper • 2406.20094 • Published • 105