Update README.md
README.md
CHANGED
@@ -13,16 +13,8 @@ pipeline_tag: sentence-similarity
 ---
 # Granite-Embedding-30m-English (revision r1.1)
 
-**News:**
-Granite Embedding R2 models with 8192 context length released.
-
-- [granite-embedding-english-r2](https://huggingface.co/ibm-granite/granite-embedding-english-r2) (149M parameters): with an output embedding size of 768, replacing granite-embedding-125m-english.
-- [granite-embedding-small-english-r2](https://huggingface.co/ibm-granite/granite-embedding-small-english-r2) (47M parameters): A first-of-its-kind reduced-size model, with fewer layers and a smaller output embedding size (384), replacing granite-embedding-30m-english.
-- granite-embedding-30m-r1.1 (30M parameters): A specialized version of the popular granite-embedding-30m-english adapted for multi-turn information retrieval.
-
 **Model Summary:**
-
-
+Granite-Embedding-30m-English is a 30M parameter dense bi-encoder embedding model from the Granite Embeddings suite that can be used to generate high quality text embeddings. This model produces embedding vectors of size 384 and is trained using a combination of open source relevance-pair datasets with permissive, enterprise-friendly license, and IBM collected and generated datasets. While maintaining competitive scores on academic benchmarks such as BEIR, this model also performs well on many enterprise use cases. This model is developed using retrieval oriented pre-training, contrastive fine-tuning, knowledge distillation and model merging for improved performance.
 
 ***Granite-embedding-30m-r1.1*** was specifically designed to support multi-turn information retrieval and is designed to handle contextual document retrieval in multi-turn conversational information retrieval. Granite-embedding-30m-r1.1 was trained on data tailored for multi-turn conversational information retrieval and uses multi-teacher distillation over granite-embedding-30m-english (https://huggingface.co/ibm-granite/granite-embedding-30m-english)
 
@@ -30,7 +22,7 @@ Granite Embedding R2 models with 8192 context length released.
 - **GitHub Repository:** [ibm-granite/granite-embedding-models](https://github.com/ibm-granite/granite-embedding-models)
 - **Website**: [Granite Docs](https://www.ibm.com/granite/docs/)
 - **Paper:** [Technical Report](https://arxiv.org/abs/2502.20204)
-- **Release Date**:
+- **Release Date**: August 29, 2025 (granite-embedding-30m-english-r1.1)
 - **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
 
 **Supported Languages:**
@@ -201,7 +193,7 @@ granite-embedding-30m-r1.1 revision maintains the fast speed of granite-embeddin
 | Model | Parameters (M)| Embedding Dimension | MTEB Retrieval (15) | MT-RAG | Mdoc2dial |
 |-------------------------------------------|:------------:|:-------------------:|:-------------------: |:----------:| :--------:|
 |granite-embedding-30m-english |30 |384 |49.1 |49.16 | 85.42 |
-|**granite-embedding-30m-english-r1.1
+|**granite-embedding-30m-english-r1.1** |30 |384 |48.9 |**52.33** | **85.78** |
 |bge-small-en-v1.5 |33 |512 |53.86 |38.26 | 83.71 |
 |e5-small-v2 |33 |384 |48.46 |28.72 | 75.7 |
 
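The restored r1.1 row in the results table can be compared against the base granite-embedding-30m-english row with a few lines of arithmetic. The sketch below is illustrative only: the scores are copied from the table in this change, while the `scores` dictionary and the `score_deltas` helper are hypothetical names introduced here and are not part of the README or of any Granite tooling.

```python
# Scores copied from the results table in the updated README.
# The dictionary layout and helper name are assumptions made for illustration.
scores = {
    "granite-embedding-30m-english": {
        "MTEB Retrieval (15)": 49.1,
        "MT-RAG": 49.16,
        "Mdoc2dial": 85.42,
    },
    "granite-embedding-30m-english-r1.1": {
        "MTEB Retrieval (15)": 48.9,
        "MT-RAG": 52.33,
        "Mdoc2dial": 85.78,
    },
}


def score_deltas(base, revised):
    """Return revised-minus-base for every benchmark, rounded to 2 decimals."""
    return {name: round(revised[name] - base[name], 2) for name in base}


deltas = score_deltas(
    scores["granite-embedding-30m-english"],
    scores["granite-embedding-30m-english-r1.1"],
)

for benchmark, delta in deltas.items():
    print(f"{benchmark}: {delta:+.2f}")
```

Read this way, the r1.1 revision gains 3.17 points on MT-RAG and 0.36 on Mdoc2dial, and gives up 0.2 points on MTEB Retrieval (15).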