jupyterjazz committed on
Commit
0cc9ab0
·
verified ·
1 Parent(s): 8e64b38

Update README.md

  1. README.md +261 -60
README.md CHANGED
@@ -1,92 +1,293 @@
1
- # Jina Embeddings V4
2
 
 
 
 
3
 
4
- ## Examples
5
 
6
- Encode functions:
 
 
7
 
8
- ```python
9
- import torch
10
- from transformers import AutoModel
11
- from PIL import Image
12
 
13
- device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
14
 
15
- # Load model
16
- model = AutoModel.from_pretrained('jinaai/jina-embeddings-v4', trust_remote_code=True)
17
- model = model.to(device)
18
 
19
- # Sample data
20
- texts = ["Here is some sample code", "This is a matching text"]
21
- image_paths = ['/<path_to_image>']
22
- images = [Image.open(path) for path in image_paths]
23
 
24
- # Example 1: Text matching task with single vector embeddings
25
- # Generate embeddings with dimension truncation (256), decrease max_pixels
26
- img_embeddings = model.encode_images(images=images, truncate_dim=256, max_pixels=602112, task='text-matching')
27
- text_embeddings = model.encode_texts(texts=texts, truncate_dim=256, max_length=512, task='text-matching')
28
 
29
- # Example 2: Retrieval task with multi-vector embeddings
30
- model.set_task(task='retrieval')
31
 
32
- # Generate multi-vector embeddings
33
- img_embeddings = model.encode_images(images=images, vector_type='multi_vector')
34
- text_embeddings = model.encode_texts(texts=texts, vector_type='multi_vector', prompt_name='passage')
 
35
 
36
- # Example 3: Code task with single vector embeddings
37
- code = ["def hello_world():\n print('Hello, World!')"]
38
- code_embeddings = model.encode_texts(texts=code, task='code')
39
 
40
- ```
 
 
 
 
41
 
42
- Using the model forward:
43
 
44
- ```python
45
- import torch
46
- from transformers import AutoModel, AutoProcessor
47
- from PIL import Image
48
 
49
- device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
50
 
51
- # Load model and processor
52
- model = AutoModel.from_pretrained('jinaai/jina-embeddings-v4', trust_remote_code=True)
53
- model = model.to(device)
54
- processor = AutoProcessor.from_pretrained('jinaai/jina-embeddings-v4', trust_remote_code=True)
55
 
 
56
 
57
- # Sample data
58
- texts = ["Here is some sample code", "This is a matching text"]
59
- image_paths = ['/<path_to_image>']
60
 
61
- # Process text and images
62
- text_batch = processor.process_texts(texts=texts, prefix="Query", max_length=512)
63
- images = [Image.open(path) for path in image_paths]
64
- image_batch = processor.process_images(images=images)
65
 
66
- # Forward pass
67
- model.eval()
68
- with torch.no_grad():
69
- text_batch = {k: v.to(device) for k, v in text_batch.items()}
70
- image_batch = {k: v.to(device) for k, v in image_batch.items()}
71
-
72
- with torch.autocast(device_type='cuda' if torch.cuda.is_available() else 'cpu'):
73
- # Get embeddings
74
- text_embeddings = model.model(**text_batch, task_label='retrieval').single_vec_emb
75
- img_embeddings = model.model(**image_batch, task_label='retrieval').single_vec_emb
 
 
 
76
 
77
78
  ```
79
 
 
80
 
81
- Inference via the `SentenceTransformer` library:
 
82
 
83
  ```python
84
  from sentence_transformers import SentenceTransformer
85
 
86
- model = SentenceTransformer(
87
- 'jinaai/jina-embeddings-v4', trust_remote_code=True
88
  )
89
 
90
- emb = model.encode(['Khinkali is the best'], task='retrieval', prompt_name='query')
91
 
92
- ```
 
1
+ <br><br>
2
 
3
+ <p align="center">
4
+ <img src="https://huggingface.co/datasets/jinaai/documentation-images/resolve/main/logo.webp" alt="Jina AI: Your Search Foundation, Supercharged!" width="150px">
5
+ </p>
6
 
 
7
 
8
+ <p align="center">
9
+ <b>The embedding model trained by <a href="https://jina.ai/"><b>Jina AI</b></a>.</b>
10
+ </p>
11
 
12
+ <p align="center">
13
+ <b>Jina Embeddings v4: Multilingual Multimodal Embeddings</b>
14
+ </p>
 
15
 
16
+ This model is based on the paper [jina-embeddings-v4: Multilingual Multimodal Embeddings](https://puginarug.com/).
17
 
 
 
 
18
 
19
+ ## Quick Start
 
 
 
20
 
21
+ [Blog](https://alwaysjudgeabookbyitscover.com/) | [Technical Report](https://puginarug.com) | [API](https://jina.ai/embeddings)
 
 
 
22
 
 
 
23
 
24
+ ## Intended Usage & Model Info
25
+ `jina-embeddings-v4` is a multilingual, multimodal embedding model designed for unified representation of text and images.
26
+ The model is specialized for complex document retrieval, including visually rich documents with charts, tables, and illustrations.
27
+ Embeddings produced by `jina-embeddings-v4` serve as the backbone for neural information retrieval and multimodal GenAI applications.
28
 
 
 
 
29
 
30
+ Built on [Qwen/Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct), `jina-embeddings-v4` has the following features:
31
+ - **Unified embeddings** for text, images, and documents, supporting both dense (single-vector) and late-interaction (multi-vector) retrieval.
32
+ - **Multilingual support** (20+ languages) and compatibility with a wide range of domains, including technical and visually complex documents.
33
+ - **Task-specific adapters** for retrieval, text matching, and code-related tasks, which can be selected at inference time.
34
+ - **Flexible embedding size**: dense embeddings have 2048 dimensions by default but can be truncated to as few as 128 dimensions with minimal performance loss.
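
The feature list mentions late-interaction (multi-vector) retrieval without saying how such embeddings are scored. A common choice for late-interaction models is ColBERT-style MaxSim: for each query token vector, take the best dot product against all document token vectors, then sum. The sketch below assumes that scoring rule; it is not confirmed by this README, the function names are illustrative, and the vectors are toy values rather than model output.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, doc_vecs):
    """Sum, over query token vectors, of the best match among document token vectors."""
    if not doc_vecs:
        return 0.0
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy 2-dimensional token vectors (assumption: real multi-vector embeddings
# would have 128 dimensions per token, as listed in the feature summary).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[1.0, 0.0], [0.6, 0.0]]

score_a = maxsim_score(query_tokens, doc_a)
score_b = maxsim_score(query_tokens, doc_b)
print(score_a, score_b)
```

Here `doc_a` covers both query tokens and so scores higher than `doc_b`, which only matches the first one.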
35
 
 
36
 
37
+ Summary of features:
38
+ | Feature | Jina Embeddings V4 |
39
+ |------------|------------|
40
+ | Base Model | Qwen2.5-VL-3B-Instruct |
41
+ | Supported Tasks | Retrieval, Text Matching, Code |
42
+ | Model DType | BFloat16 |
43
+ | Max Sequence Length | 8192 |
44
+ | Single-Vector Dimension | 2048 |
45
+ | Multi-Vector Dimension | 128 |
46
+ | Matryoshka dimensions | 128, 256, 512, 1024, 2048 |
47
+ | Attention Mechanism | FlashAttention2 |
48
+ | Pooling Strategy | Mean pooling |
49
+
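
The Matryoshka dimensions in the table suggest that a dense embedding can be shortened by keeping only its leading components. The README does not describe how `truncate_dim` is applied internally, so the following is only a sketch of the usual Matryoshka convention: slice the first `dim` values, then L2-normalize again so cosine and dot-product scores stay comparable. The helper name and the toy vector are assumptions for illustration.

```python
import math

# Sizes listed in the feature table.
MATRYOSHKA_DIMS = (128, 256, 512, 1024, 2048)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-normalize the result."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}")
    if dim > len(vec):
        raise ValueError("dim exceeds the embedding length")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to normalize for an all-zero vector
    return [x / norm for x in head]


# Toy stand-in for a 2048-dimensional dense embedding (not real model output).
full = [math.sin(i + 1) for i in range(2048)]
small = truncate_embedding(full, 128)
print(len(small))
```

Truncating on the client side like this trades some retrieval quality for smaller indexes; passing `truncate_dim` to the encode functions, as the usage examples mention, avoids the extra step.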
50
 
 
51
 
52
+ ## Training, Data, Parameters
 
 
 
53
 
54
+ Please refer to our [technical report of jina-embeddings-v4](https://puginarug.com) for the model and training details.
55
 
 
 
 
56
 
57
+ ## Usage
 
 
 
58
 
59
+ <details>
60
+ <summary>Requirements</summary>
61
+
62
+ The following Python packages are required:
63
+ - `transformers>=4.52.0`
64
+ - `torch>=2.6.0`
65
+ - `peft>=0.15.2`
66
+ - `torchvision`
67
+ - `pillow`
68
+
69
+ ### Optional / Recommended
70
+ - **flash-attention**: Installing [flash-attention](https://github.com/Dao-AILab/flash-attention) is recommended for improved inference speed and efficiency, but not mandatory.
71
+ - **sentence-transformers**: If you want to use the model via the `sentence-transformers` interface, install this package as well.
72
 
73
 
74
+ </details>
75
+
76
+
77
+ <details>
78
+ <summary>via Jina AI <a href="https://jina.ai/embeddings/">Embedding API</a></summary>
79
+
80
+ Needs to be adjusted for V4
81
+ ```bash
82
+ curl https://api.jina.ai/v1/embeddings \
83
+ -H "Content-Type: application/json" \
84
+ -H "Authorization: Bearer [JINA_AI_API_TOKEN]" \
85
+ -d @- <<EOFEOF
86
+ {
87
+ "model": "jina-embeddings-v4",
88
+ "dimensions": 1024,
89
+ "task": "retrieval.query",
90
+ "normalized": true,
91
+ "embedding_type": "float",
92
+ "input": [
93
+ {
94
+ "text": "غروب جميل على الشاطئ"
95
+ },
96
+ {
97
+ "text": "海滩上美丽的日落"
98
+ },
99
+ {
100
+ "text": "A beautiful sunset over the beach"
101
+ },
102
+ {
103
+ "text": "Un beau coucher de soleil sur la plage"
104
+ },
105
+ {
106
+ "text": "Ein wunderschöner Sonnenuntergang am Strand"
107
+ },
108
+ {
109
+ "text": "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία"
110
+ },
111
+ {
112
+ "text": "समुद्र तट पर एक खूबसूरत सूर्यास्त"
113
+ },
114
+ {
115
+ "text": "Un bellissimo tramonto sulla spiaggia"
116
+ },
117
+ {
118
+ "text": "浜辺に沈む美しい夕日"
119
+ },
120
+ {
121
+ "text": "해변 위로 아름다운 일몰"
122
+ },
123
+ {
124
+ "image": "https://i.ibb.co/nQNGqL0/beach1.jpg"
125
+ },
126
+ {
127
+ "image": "https://i.ibb.co/r5w8hG8/beach2.jpg"
128
+ }
129
+ ]
130
+ }
131
+ EOFEOF
132
  ```
133
 
134
+ </details>
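
The same request can be assembled from Python. This is a minimal sketch that mirrors the fields of the curl example using only the standard library; since the README itself notes that the payload still needs to be adjusted for V4, the field names should be treated as provisional. The helper name and the `JINA_AI_API_TOKEN` environment variable are assumptions, and the request is built but not sent.

```python
import json
import os
import urllib.request

API_URL = "https://api.jina.ai/v1/embeddings"  # endpoint from the curl example


def build_embedding_request(texts, image_urls, token, dimensions=1024):
    """Build (but do not send) a POST request mirroring the curl payload."""
    inputs = [{"text": t} for t in texts] + [{"image": u} for u in image_urls]
    payload = {
        "model": "jina-embeddings-v4",
        "dimensions": dimensions,
        "task": "retrieval.query",
        "normalized": True,
        "embedding_type": "float",
        "input": inputs,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Hypothetical environment variable name; substitute however you store the token.
token = os.environ.get("JINA_AI_API_TOKEN", "<your-token>")
request = build_embedding_request(
    ["A beautiful sunset over the beach"],
    ["https://i.ibb.co/nQNGqL0/beach1.jpg"],
    token,
)
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```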
135
 
136
+ <details>
137
+ <summary>via <a href="https://huggingface.co/docs/transformers/en/index">transformers</a></summary>
138
 
139
+ ```python
140
+ # !pip install "transformers>=4.52.0" "torch>=2.6.0" "peft>=0.15.2" torchvision pillow
141
+ # !pip install
142
+ from transformers import AutoModel
143
+
144
+ # Initialize the model
145
+ model = AutoModel.from_pretrained("jinaai/jina-embeddings-v4", trust_remote_code=True)
146
+ # ========================
147
+ # 1. Retrieval Task
148
+ # ========================
149
+ # Configure truncate_dim, max_length (for texts), max_pixels (for images), vector_type, batch_size in the encode function if needed
150
+
151
+ # Encode query
152
+ query_embedding = model.encode_texts(
153
+ texts=["Overview of climate change impacts on coastal cities"],
154
+ task="retrieval",
155
+ prompt_name="query",
156
+ )[0]
157
+
158
+ # Encode passage (text)
159
+ passage_embedding = model.encode_texts(
160
+ texts=[
161
+ "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
162
+ ],
163
+ task="retrieval",
164
+ prompt_name="passage",
165
+ )[0]
166
+
167
+ # Encode image/document
168
+ image_embedding = model.encode_images(
169
+ images=["https://i.ibb.co/nQNGqL0/beach1.jpg"],
170
+ task="retrieval",
171
+ )[0]
172
+
173
+ # ========================
174
+ # 2. Text Matching Task
175
+ # ========================
176
+ texts = [
177
+ "غروب جميل على الشاطئ", # Arabic
178
+ "海滩上美丽的日落", # Chinese
179
+ "Un beau coucher de soleil sur la plage", # French
180
+ "Ein wunderschöner Sonnenuntergang am Strand", # German
181
+ "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία", # Greek
182
+ "समुद्र तट पर एक खूबसूरत सूर्यास्त", # Hindi
183
+ "Un bellissimo tramonto sulla spiaggia", # Italian
184
+ "浜辺に沈む美しい夕日", # Japanese
185
+ "해변 위로 아름다운 일몰", # Korean
186
+ ]
187
+
188
+ text_embeddings = model.encode_texts(texts=texts, task="text-matching")
189
+
190
+ # ========================
191
+ # 3. Code Understanding Task
192
+ # ========================
193
+
194
+ # Encode query
195
+ query_embedding = model.encode_texts(
196
+ texts=["Find a function that prints a greeting message to the console"],
197
+ task="code",
198
+ prompt_name="query",
199
+ )
200
+
201
+ # Encode code
202
+ code_embeddings = model.encode_texts(
203
+ texts=["def hello_world():\n print('Hello, World!')"],
204
+ task="code",
205
+ prompt_name="passage",
206
+ )
207
+ ```
208
+ </details>
209
+
210
+ <details>
211
+ <summary>via <a href="https://sbert.net/">sentence-transformers</a></summary>
212
+
213
  ```python
214
  from sentence_transformers import SentenceTransformer
215
 
216
+ # Initialize the model
217
+ model = SentenceTransformer("jinaai/jina-embeddings-v4", trust_remote_code=True)
218
+ # ========================
219
+ # 1. Retrieval Task
220
+ # ========================
221
+ # Encode query
222
+ query_embedding = model.encode(
223
+ sentences=["Overview of climate change impacts on coastal cities"],
224
+ task="retrieval",
225
+ prompt_name="query",
226
+ )[0]
227
+
228
+ # Encode passage (text)
229
+ passage_embedding = model.encode(
230
+ sentences=[
231
+ "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
232
+ ],
233
+ task="retrieval",
234
+ prompt_name="passage",
235
+ )[0]
236
+
237
+ # Encode image/document
238
+ image_embedding = model.encode(
239
+ sentences=["https://i.ibb.co/nQNGqL0/beach1.jpg"],
240
+ task="retrieval",
241
+ )[0]
242
+
243
+ # ========================
244
+ # 2. Text Matching Task
245
+ # ========================
246
+ texts = [
247
+ "غروب جميل على الشاطئ", # Arabic
248
+ "海滩上美丽的日落", # Chinese
249
+ "Un beau coucher de soleil sur la plage", # French
250
+ "Ein wunderschöner Sonnenuntergang am Strand", # German
251
+ "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία", # Greek
252
+ "समुद्र तट पर एक खूबसूरत सूर्यास्त", # Hindi
253
+ "Un bellissimo tramonto sulla spiaggia", # Italian
254
+ "浜辺に沈む美しい夕日", # Japanese
255
+ "해변 위로 아름다운 일몰", # Korean
256
+ ]
257
+
258
+ text_embeddings = model.encode(sentences=texts, task="text-matching")
259
+
260
+ # ========================
261
+ # 3. Code Understanding Task
262
+ # ========================
263
+
264
+ # Encode query
265
+ query_embedding = model.encode(
266
+ sentences=["Find a function that prints a greeting message to the console"],
267
+ task="code",
268
+ prompt_name="query",
269
  )
270
 
271
+ # Encode code
272
+ code_embeddings = model.encode(
273
+ sentences=["def hello_world():\n print('Hello, World!')"],
274
+ task="code",
275
+ prompt_name="passage",
276
+ )
277
+ ```
278
+ </details>
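
The usage examples stop once the embeddings are computed. For the retrieval task, a typical next step is to rank passages by cosine similarity to the query embedding. The sketch below assumes the embeddings have been converted to plain sequences of floats (the actual return type of the encode functions may be tensors or arrays) and uses toy 3-dimensional vectors instead of model output; the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for query and passage embeddings.
query = [1.0, 0.0, 1.0]
passages = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_passages(query, passages)
print(ranking)
```

Because cosine similarity ignores vector length, the second passage (a scaled copy of the query) ranks first even though its norm differs.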
279
+
280
+
281
+ ## License
282
+
283
+ This model is licensed to download and run under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.en). It is available for commercial use via the [Jina Embeddings API](https://jina.ai/embeddings/), [AWS](https://longdogechallenge.com/), [Azure](https://longdogechallenge.com/), and [GCP](https://longdogechallenge.com/). To download for commercial use, please [contact us](https://jina.ai/contact-sales).
284
+
285
+
286
+ ## Contact
287
+
288
+ Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.
289
+
290
+
291
+ ## Citation
292
 
293
+ If you find `jina-embeddings-v4` useful in your research, please cite the following paper: