Nouamane Tazi PRO

nouamanetazi

https://nouamanetazi.github.io

AI & ML interests

Scale it 'til you make it

Recent Activity

liked a Space 3 days ago

Victarry/PP-schedule-visualizer

updated a model 11 days ago

nouamanetazi/exp27

updated a model 11 days ago

nouamanetazi/exp26

upvoted an article about 2 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

By

and 22 others •

Jul 8

• 636

upvoted 3 collections about 2 months ago

SmolLM3 evaluation datasets

Datasets to decontaminate the post-training mixtures against. Use the subset and column values described per entry • 13 items • Updated Jul 8 • 5

SmolLM3 pretraining datasets

Datasets used in SmolLM3 pretraining • 15 items • Updated 12 days ago • 28

🧠 SmolLM3

Smol, multilingual, long-context reasoner • 12 items • Updated 19 days ago • 71

upvoted an article about 2 months ago

Article

Bringing Fusion Down to Earth: ML for Stellarator Optimization

By

•

Jul 2

• 73

upvoted a paper about 2 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 69

upvoted 2 articles 2 months ago

Article

Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub

By

and 6 others •

Jun 12

• 125

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

By

and 8 others •

Jun 3

• 234

upvoted an article 3 months ago

Article

Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

By

and 2 others •

Mar 20, 2024

• 102

upvoted an article 4 months ago

Article

Tiny Agents: a MCP-powered agent in 50 lines of code

By

•

Apr 25

• 295

upvoted 3 articles 6 months ago

Article

Open R1: Update #3

By

and 9 others •

Mar 11

• 295

Article

Atlaset Dataset for Moroccan Darija: From Data Collection, Analysis, to Model Trainings

By

and 1 other •

Mar 6

• 26

Article

MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era

By

•

Jan 15

• 47

upvoted a paper 6 months ago

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published Feb 19 • 38

upvoted 2 articles 6 months ago

Article

Open R1: Update #2

By

and 6 others •

Feb 10

• 218

Article

The Open Arabic LLM Leaderboard 2

By

and 7 others •

Feb 10

• 35

upvoted 2 articles 7 months ago

Article

Arabic RAG Leaderboard: A Comprehensive Framework for Evaluating Arabic Language Retrieval Systems

By

and 1 other •

Feb 9

• 13

Article

Darija Chatbot Arena: Making LLMs Compete in the Moroccan Dialect

By

and 2 others •

Feb 10

• 14

upvoted a paper 7 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 242

upvoted an article 7 months ago

Article

Open-R1: Update #1

By

and 7 others •

Feb 2

• 305