Upload README.md with huggingface_hub
Browse files
README.md
CHANGED
|
@@ -6,10 +6,13 @@ base_model_relation: quantized
|
|
| 6 |
|
| 7 |
# jina-embeddings-v4-gguf
|
| 8 |
|
| 9 |
-
[Read this blog post for more details](https://jina.ai/news/optimizing-ggufs-for-decoder-only-embedding-models)
|
| 10 |
-
|
| 11 |
A collection of GGUF and quantizations for [`jina-embeddings-v4`](https://huggingface.co/jinaai/jina-embeddings-v4).
|
| 12 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 13 |
## Overview
|
| 14 |
|
| 15 |
`jina-embeddings-v4` is a cutting-edge universal embedding model [for multimodal multilingual retrieval](https://jina.ai/news/jina-embeddings-v4-universal-embeddings-for-multimodal-multilingual-retrieval). It's based on `qwen2.5-vl-3b-instruct` with three LoRA adapters: `retrieval` (optimized for retrieval tasks), `text-matching` (optimized for sentence similarity tasks), and `code` (optimized for code retrieval tasks). It is also heavily trained for visual document retrieval and late-interaction style multi-vector output.
|
|
@@ -206,4 +209,6 @@ main: number of embeddings = 5090
|
|
| 206 |
| Q8_0 | 0.7938 | 0.5687 | 0.4784 | 0.4335 | 0.7851 | +7% | +0% | +11% | +2% | +4% | -1% | +7% | -0% | +0% | -1% |
|
| 207 |
| F16 | 0.7940 | 0.5610 | 0.4931 | 0.4343 | 0.7963 | +7% | +0% | +9% | +1% | +7% | +2% | +7% | -0% | +2% | +0% |
|
| 208 |
| v3 (Transformers) | 0.7393 | 0.5144 | 0.4600 | 0.4068 | 0.7820 | +0% | -7% | +0% | -8% | +0% | -5% | +0% | -6% | +0% | -2% |
|
| 209 |
-
| v4 (Transformers) | 0.7977 | 0.5571 | 0.4844 | 0.4351 | 0.7963 | +8% | +0% | +8% | +0% | +5% | +0% | +7% | +0% | +2% | +0% |
|
|
|
|
|
|
|
|
|
| 6 |
|
| 7 |
# jina-embeddings-v4-gguf
|
| 8 |
|
|
|
|
|
|
|
| 9 |
A collection of GGUF and quantizations for [`jina-embeddings-v4`](https://huggingface.co/jinaai/jina-embeddings-v4).
|
| 10 |
|
| 11 |
+
> [!IMPORTANT]
|
| 12 |
+
> We highly recommend to first read [this blog post for more technical details and customized llama.cpp build](https://jina.ai/news/optimizing-ggufs-for-decoder-only-embedding-models).
|
| 13 |
+
|
| 14 |
+
|
| 15 |
+
|
| 16 |
## Overview
|
| 17 |
|
| 18 |
`jina-embeddings-v4` is a cutting-edge universal embedding model [for multimodal multilingual retrieval](https://jina.ai/news/jina-embeddings-v4-universal-embeddings-for-multimodal-multilingual-retrieval). It's based on `qwen2.5-vl-3b-instruct` with three LoRA adapters: `retrieval` (optimized for retrieval tasks), `text-matching` (optimized for sentence similarity tasks), and `code` (optimized for code retrieval tasks). It is also heavily trained for visual document retrieval and late-interaction style multi-vector output.
|
|
|
|
| 209 |
| Q8_0 | 0.7938 | 0.5687 | 0.4784 | 0.4335 | 0.7851 | +7% | +0% | +11% | +2% | +4% | -1% | +7% | -0% | +0% | -1% |
|
| 210 |
| F16 | 0.7940 | 0.5610 | 0.4931 | 0.4343 | 0.7963 | +7% | +0% | +9% | +1% | +7% | +2% | +7% | -0% | +2% | +0% |
|
| 211 |
| v3 (Transformers) | 0.7393 | 0.5144 | 0.4600 | 0.4068 | 0.7820 | +0% | -7% | +0% | -8% | +0% | -5% | +0% | -6% | +0% | -2% |
|
| 212 |
+
| v4 (Transformers) | 0.7977 | 0.5571 | 0.4844 | 0.4351 | 0.7963 | +8% | +0% | +8% | +0% | +5% | +0% | +7% | +0% | +2% | +0% |
|
| 213 |
+
|
| 214 |
+
|