AI & ML interests
Massive Text Embeddings Benchmark
Papers
HUME: Measuring the Human-Model Performance Gap in Text Embedding Task
Maintaining MTEB: Towards Long Term Usability and Reproducibility of Embedding Benchmarks
Organization Card
MTEB is a Python framework for evaluating embeddings and retrieval systems for both text and images. It covers more than 1,000 languages and a diverse set of tasks, from classics such as classification and clustering to specialized use cases such as legal, code, or healthcare retrieval.
To get started with mteb, check out our documentation.
| Overview | |
|---|---|
| 📈 Leaderboard | The interactive leaderboard of the benchmark |
| Get Started | |
| 🏃 Get Started | Overview of how to use mteb |
| 🤖 Defining Models | How to use existing models and define custom ones |
| 📋 Selecting tasks | How to select tasks, benchmarks, splits etc. |
| 🏭 Running Evaluation | How to run the evaluations, including cache management, speeding up evaluations etc. |
| 📊 Loading Results | How to load and work with existing model results |
| Overview | |
| 📋 Tasks | Overview of available tasks |
| 📐 Benchmarks | Overview of available benchmarks |
| 🤖 Models | Overview of available models |
| Contributing | |
| 🤖 Adding a model | How to submit a model to MTEB and to the leaderboard |
| 👩‍💻 Adding a dataset | How to add a new task/dataset to MTEB |
| 👩‍💻 Adding a benchmark | How to add a new benchmark to MTEB and to the leaderboard |
| 🤝 Contributing | How to contribute to MTEB and set it up for development |
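As a rough illustration of the workflow the table points to (define a model, select tasks, run the evaluation), here is a minimal sketch. It assumes a recent `mteb` release that exposes `mteb.get_model`, `mteb.get_tasks`, and `mteb.evaluate`; older releases used `mteb.MTEB(tasks=tasks).run(model)` instead. The helper name `run_quickstart`, the model name, and the task name are illustrative choices, not taken from this page, so check the documentation for the API of your installed version.

```python
def run_quickstart(
    model_name: str = "sentence-transformers/all-MiniLM-L6-v2",  # illustrative model choice
    task_names: tuple[str, ...] = ("Banking77Classification",),  # illustrative task choice
):
    """Evaluate one embedding model on a small set of MTEB tasks.

    Hypothetical helper for illustration only; it is not part of mteb.
    """
    # Imported inside the function so that defining the helper
    # does not require mteb to be installed.
    import mteb

    # Load a model by name.
    model = mteb.get_model(model_name)

    # Select the tasks to evaluate on.
    tasks = mteb.get_tasks(tasks=list(task_names))

    # Run the evaluation (assumes the newer `mteb.evaluate` entry point).
    return mteb.evaluate(model, tasks=tasks)
```

Calling `run_quickstart()` downloads the model and the task data, so it needs network access and may take a while on CPU.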
This is a collection of MTEB papers (not exhaustive).
- MMTEB: Massive Multilingual Text Embedding Benchmark (Paper 2502.13595)
- MTEB: Massive Text Embedding Benchmark (Paper 2210.07316)
- The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding (Paper 2406.02396)
- Extending the Massive Text Embedding Benchmark to French (Paper 2405.20468)
A collection of items related to the MMTEB release
Datasets (1,282)
- mteb/results
- mteb/common_voice_21_0_mini
- mteb/common_voice_17_0_mini
- mteb/turkish-constitutional-court-violation-clean
- mteb/MrTyDiJaRetrievalLite
- mteb/MIRACLJaRetrievalLite
- mteb/JaqketRetrievalLite
- mteb/JaCWIRRetrievalLite
- mteb/JaCWIRRerankingLite
- mteb/JQaRARerankingLite