Insights from the ICLR Peer Review and Rebuttal Process Paper • 2511.15462 • Published Nov 19, 2025 • 6
CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs Paper • 2510.09871 • Published Oct 10, 2025 • 2
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 75
How Programming Concepts and Neurons Are Shared in Code Language Models Paper • 2506.01074 • Published Jun 1, 2025 • 3
Tracing Multilingual Factual Knowledge Acquisition in Pretraining Paper • 2505.14824 • Published May 20, 2025 • 4
On Relation-Specific Neurons in Large Language Models Paper • 2502.17355 • Published Feb 24, 2025 • 10
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages Paper • 2410.23825 • Published Oct 31, 2024 • 4
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment Paper • 2410.05873 • Published Oct 8, 2024 • 3