noystl committed
Commit ba0f36e · verified · 1 Parent(s): 4a2996f

Upload 11 files
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 1024,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,762 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:784827
+ - loss:ContrastiveLoss
+ base_model: BAAI/bge-large-en-v1.5
+ widget:
+ - source_sentence: 'Represent this sentence for searching relevant passages: Existing
+     methods for anomaly detection on dynamic graphs struggle with capturing complex
+     time information in graph structures and generating effective negative samples
+     for unsupervised learning. These challenges highlight the need for improved methodologies
+     that can address the limitations of current approaches in this field.We suggest
+     combining ''a message-passing framework'' and '
+   sentences:
+   - a single global model
+   - videos
+   - sequential polygon generation
+ - source_sentence: 'Represent this sentence for searching relevant passages: The study
+     addresses the need for effective tools that allow both novice and expert users
+     to analyze the diversity of news coverage about events. It highlights the importance
+     of tailoring the interface to accommodate non-expert users while also considering
+     the insights of journalism-savvy users, indicating a gap in existing systems that
+     cater to varying levels of expertise in news analysis.We suggest combining ''a
+     coordinated visualization interface tailored for visualization non-expert users''
+     and '
+   sentences:
+   - worst-case resource analysis
+   - Graph Convolution Networks
+   - a text encoder
+ - source_sentence: 'Represent this sentence for searching relevant passages: The accuracy
+     of pixel flows is crucial for achieving high-quality video enhancement, yet most
+     prior works focus on estimating dense flows that are generally less robust and
+     computationally expensive. This highlights a gap in existing methodologies that
+     fail to prioritize accuracy over density, necessitating a more efficient approach
+     to flow estimation for video enhancement tasks.We suggest combining ''sparse point
+     cloud data'' and '
+   sentences:
+   - a Temporal Eigenvalue Loss
+   - diffusion models
+   - explicit 3D representations, such as polygonal meshes
+ - source_sentence: 'Represent this sentence for searching relevant passages: The traditional
+     frame of discernment lacks a crucial factor, the sequence of propositions, which
+     limits the effectiveness of existing methods to measure uncertainty. This gap
+     highlights the need for a more comprehensive approach that can better represent
+     the relationships between the elements of the frame of discernment.We suggest
+     ''combine the order of propositions and the mass of them'' inspired by '
+   sentences:
+   - the traditional matching-optimization methods where matching is introduced to
+     handle large displacements before energy-based optimizations
+   - encoder-decoder models
+   - In another vein, researchers propose new attention augmentation methods to make
+     transformers more accurate, efficient and interpretable
+ - source_sentence: 'Represent this sentence for searching relevant passages: The study
+     addresses the need for effective time series forecasting methods to estimate the
+     spread of epidemics, particularly in light of the resurgence of COVID-19 cases.
+     It highlights the importance of accurately modeling both linear and non-linear
+     features of epidemic data to provide state authorities and health officials with
+     reliable short-term forecasts and strategies.We suggest combining ''ARIMA'' and '
+   sentences:
+   - Transformers
+   - a traditional feature-mixed branch
+   - the human brain is able to efficiently learn effective control strategies using
+     limited resources
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+ 
+ # SentenceTransformer based on BAAI/bge-large-en-v1.5
+ 
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ 
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) <!-- at revision d4aa6901d3a41ba39fb536a557fa166f842b0e09 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 1024 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+ 
+ ### Full Model Architecture
+ 
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
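+ 
+ The same three-module stack can be assembled by hand from the `sentence_transformers.models` building blocks. A minimal sketch mirroring `modules.json` (Transformer → CLS pooling → Normalize; the variable names are illustrative):
+ 
+ ```python
+ from sentence_transformers import SentenceTransformer, models
+ 
+ # Module 0: the BERT encoder with a 512-token window
+ word = models.Transformer("BAAI/bge-large-en-v1.5", max_seq_length=512)
+ # Module 1: CLS pooling over the 1024-dim token embeddings
+ pool = models.Pooling(word.get_word_embedding_dimension(), pooling_mode="cls")
+ # Module 2: L2 normalization, so dot product equals cosine similarity
+ model = SentenceTransformer(modules=[word, pool, models.Normalize()])
+ ```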
+ 
+ ## Usage
+ 
+ ### Direct Usage (Sentence Transformers)
+ 
+ First install the Sentence Transformers library:
+ 
+ ```bash
+ pip install -U sentence-transformers
+ ```
+ 
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ sentences = [
+     "Represent this sentence for searching relevant passages: The study addresses the need for effective time series forecasting methods to estimate the spread of epidemics, particularly in light of the resurgence of COVID-19 cases. It highlights the importance of accurately modeling both linear and non-linear features of epidemic data to provide state authorities and health officials with reliable short-term forecasts and strategies.We suggest combining 'ARIMA' and ",
+     'Transformers',
+     'the human brain is able to efficiently learn effective control strategies using limited resources',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 1024]
+ 
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
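+ 
+ Because the model was trained with a query-side prompt, queries should carry the same prefix at inference time. A minimal retrieval sketch reusing the `model` above (the candidate passages are hypothetical):
+ 
+ ```python
+ query = "Represent this sentence for searching relevant passages: We suggest combining 'ARIMA' and "
+ passages = ["Transformers", "a text encoder", "diffusion models"]  # hypothetical candidates
+ 
+ # Embeddings are L2-normalized, so cosine similarity ranks candidates directly
+ scores = model.similarity(model.encode([query]), model.encode(passages))
+ print(passages[int(scores.argmax())])
+ ```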
+ 
+ <!--
+ ### Direct Usage (Transformers)
+ 
+ <details><summary>Click to see the direct usage in Transformers</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+ 
+ You can finetune this model on your own dataset.
+ 
+ <details><summary>Click to expand</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Dataset
+ 
+ #### Unnamed Dataset
+ 
+ * Size: 784,827 training samples
+ * Columns: <code>query</code>, <code>answer</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | query | answer | label |
+   |:--------|:------|:-------|:------|
+   | type    | string | string | int |
+   | details | <ul><li>min: 66 tokens</li><li>mean: 83.86 tokens</li><li>max: 99 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 8.63 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>0: ~96.70%</li><li>1: ~3.30%</li></ul> |
+ * Samples:
+   | query | answer | label |
+   |:------|:-------|:------|
+   | <code>Represent this sentence for searching relevant passages: The study addresses the challenge of action segmentation under weak supervision, where the available ground truth only indicates the presence of actions without providing their temporal ordering or occurrence timing in training videos. This limitation necessitates the development of a method to generate pseudo-ground truth for effective training and improve performance in action segmentation and alignment tasks.We suggest combining 'a Hidden Markov Model' and </code> | <code>a multilayer perceptron</code> | <code>1</code> |
+   | <code>Represent this sentence for searching relevant passages: The study addresses the challenge of action segmentation under weak supervision, where the available ground truth only indicates the presence of actions without providing their temporal ordering or occurrence timing in training videos. This limitation necessitates the development of a method to generate pseudo-ground truth for effective training and improve performance in action segmentation and alignment tasks.We suggest combining 'a Hidden Markov Model' and </code> | <code>global expression information</code> | <code>0</code> |
+   | <code>Represent this sentence for searching relevant passages: The study addresses the challenge of action segmentation under weak supervision, where the available ground truth only indicates the presence of actions without providing their temporal ordering or occurrence timing in training videos. This limitation necessitates the development of a method to generate pseudo-ground truth for effective training and improve performance in action segmentation and alignment tasks.We suggest combining 'a Hidden Markov Model' and </code> | <code>some relevant physical parameters</code> | <code>0</code> |
+ * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
+   ```json
+   {
+       "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
+       "margin": 0.5,
+       "size_average": true
+   }
+   ```
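+ 
+ For reference, this corresponds roughly to the following loss construction (a sketch, not the exact training script): label-1 pairs are pulled together by the squared distance, while label-0 pairs are pushed apart via max(0, margin − d)².
+ 
+ ```python
+ from sentence_transformers import SentenceTransformer, losses
+ 
+ model = SentenceTransformer("BAAI/bge-large-en-v1.5")
+ loss = losses.ContrastiveLoss(
+     model=model,
+     distance_metric=losses.SiameseDistanceMetric.COSINE_DISTANCE,
+     margin=0.5,         # dissimilar pairs are penalized only within this margin
+     size_average=True,  # average (rather than sum) the per-pair losses
+ )
+ ```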
+ 
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+ 
+ - `per_device_train_batch_size`: 64
+ - `learning_rate`: 2.3351317368662443e-06
+ - `warmup_ratio`: 0.11883406097525227
+ - `bf16`: True
+ - `prompts`: {'query': 'Represent this sentence for searching relevant passages: '}
+ - `batch_sampler`: no_duplicates
+ 
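+ In the v3 trainer API these settings would be passed roughly as follows (a sketch under the framework versions listed below; `train_dataset` is a hypothetical `datasets.Dataset` with the `query`/`answer`/`label` columns described above, and `model`/`loss` are the objects from the previous sketches):
+ 
+ ```python
+ from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
+ from sentence_transformers.training_args import BatchSamplers
+ 
+ args = SentenceTransformerTrainingArguments(
+     output_dir="output",  # hypothetical path
+     num_train_epochs=3,
+     per_device_train_batch_size=64,
+     learning_rate=2.3351317368662443e-06,
+     warmup_ratio=0.11883406097525227,
+     bf16=True,
+     prompts={"query": "Represent this sentence for searching relevant passages: "},
+     batch_sampler=BatchSamplers.NO_DUPLICATES,  # no duplicate samples within a batch
+ )
+ SentenceTransformerTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss).train()
+ ```
+ 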
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+ 
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 64
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2.3351317368662443e-06
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 3
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.11883406097525227
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: {'query': 'Represent this sentence for searching relevant passages: '}
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+ 
+ </details>
331
+ ### Training Logs
332
+ <details><summary>Click to expand</summary>
333
+
334
+ | Epoch | Step | Training Loss |
335
+ |:------:|:-----:|:-------------:|
336
+ | 0.0082 | 100 | 0.0051 |
337
+ | 0.0163 | 200 | 0.0038 |
338
+ | 0.0245 | 300 | 0.0037 |
339
+ | 0.0326 | 400 | 0.0036 |
340
+ | 0.0408 | 500 | 0.0046 |
341
+ | 0.0489 | 600 | 0.0035 |
342
+ | 0.0571 | 700 | 0.0035 |
343
+ | 0.0652 | 800 | 0.0034 |
344
+ | 0.0734 | 900 | 0.0044 |
345
+ | 0.0815 | 1000 | 0.0034 |
346
+ | 0.0897 | 1100 | 0.0035 |
347
+ | 0.0979 | 1200 | 0.0034 |
348
+ | 0.1060 | 1300 | 0.0034 |
349
+ | 0.1142 | 1400 | 0.0045 |
350
+ | 0.1223 | 1500 | 0.0034 |
351
+ | 0.1305 | 1600 | 0.0034 |
352
+ | 0.1386 | 1700 | 0.0033 |
353
+ | 0.1468 | 1800 | 0.0043 |
354
+ | 0.1549 | 1900 | 0.0034 |
355
+ | 0.1631 | 2000 | 0.0033 |
356
+ | 0.1712 | 2100 | 0.0032 |
357
+ | 0.1794 | 2200 | 0.0033 |
358
+ | 0.1876 | 2300 | 0.0044 |
359
+ | 0.1957 | 2400 | 0.0033 |
360
+ | 0.2039 | 2500 | 0.0034 |
361
+ | 0.2120 | 2600 | 0.0033 |
362
+ | 0.2202 | 2700 | 0.0042 |
363
+ | 0.2283 | 2800 | 0.0034 |
364
+ | 0.2365 | 2900 | 0.0033 |
365
+ | 0.2446 | 3000 | 0.0033 |
366
+ | 0.2528 | 3100 | 0.0036 |
367
+ | 0.2609 | 3200 | 0.0039 |
368
+ | 0.2691 | 3300 | 0.0033 |
369
+ | 0.2773 | 3400 | 0.0032 |
370
+ | 0.2854 | 3500 | 0.0034 |
371
+ | 0.2936 | 3600 | 0.0041 |
372
+ | 0.3017 | 3700 | 0.0031 |
373
+ | 0.3099 | 3800 | 0.0032 |
374
+ | 0.3180 | 3900 | 0.0031 |
375
+ | 0.3262 | 4000 | 0.004 |
376
+ | 0.3343 | 4100 | 0.0034 |
377
+ | 0.3425 | 4200 | 0.003 |
378
+ | 0.3506 | 4300 | 0.0032 |
379
+ | 0.3588 | 4400 | 0.0032 |
380
+ | 0.3670 | 4500 | 0.004 |
381
+ | 0.3751 | 4600 | 0.0031 |
382
+ | 0.3833 | 4700 | 0.0033 |
383
+ | 0.3914 | 4800 | 0.0031 |
384
+ | 0.3996 | 4900 | 0.004 |
385
+ | 0.4077 | 5000 | 0.0032 |
386
+ | 0.4159 | 5100 | 0.0031 |
387
+ | 0.4240 | 5200 | 0.0031 |
388
+ | 0.4322 | 5300 | 0.0031 |
389
+ | 0.4403 | 5400 | 0.0039 |
390
+ | 0.4485 | 5500 | 0.003 |
391
+ | 0.4567 | 5600 | 0.003 |
392
+ | 0.4648 | 5700 | 0.0031 |
393
+ | 0.4730 | 5800 | 0.0038 |
394
+ | 0.4811 | 5900 | 0.0031 |
395
+ | 0.4893 | 6000 | 0.0032 |
396
+ | 0.4974 | 6100 | 0.0031 |
397
+ | 0.5056 | 6200 | 0.0033 |
398
+ | 0.5137 | 6300 | 0.0035 |
399
+ | 0.5219 | 6400 | 0.0031 |
400
+ | 0.5300 | 6500 | 0.0031 |
401
+ | 0.5382 | 6600 | 0.0031 |
402
+ | 0.5464 | 6700 | 0.0038 |
403
+ | 0.5545 | 6800 | 0.0031 |
404
+ | 0.5627 | 6900 | 0.003 |
405
+ | 0.5708 | 7000 | 0.0029 |
406
+ | 0.5790 | 7100 | 0.0037 |
407
+ | 0.5871 | 7200 | 0.0033 |
408
+ | 0.5953 | 7300 | 0.0031 |
409
+ | 0.6034 | 7400 | 0.003 |
410
+ | 0.6116 | 7500 | 0.003 |
411
+ | 0.6198 | 7600 | 0.004 |
412
+ | 0.6279 | 7700 | 0.0031 |
413
+ | 0.6361 | 7800 | 0.0031 |
414
+ | 0.6442 | 7900 | 0.0031 |
415
+ | 0.6524 | 8000 | 0.0039 |
416
+ | 0.6605 | 8100 | 0.0029 |
417
+ | 0.6687 | 8200 | 0.003 |
418
+ | 0.6768 | 8300 | 0.0029 |
419
+ | 0.6850 | 8400 | 0.0028 |
420
+ | 0.6931 | 8500 | 0.0036 |
421
+ | 0.7013 | 8600 | 0.0031 |
422
+ | 0.7095 | 8700 | 0.0029 |
423
+ | 0.7176 | 8800 | 0.0028 |
424
+ | 0.7258 | 8900 | 0.0035 |
425
+ | 0.7339 | 9000 | 0.0033 |
426
+ | 0.7421 | 9100 | 0.003 |
427
+ | 0.7502 | 9200 | 0.0028 |
428
+ | 0.7584 | 9300 | 0.0029 |
429
+ | 0.7665 | 9400 | 0.0035 |
430
+ | 0.7747 | 9500 | 0.003 |
431
+ | 0.7828 | 9600 | 0.0028 |
432
+ | 0.7910 | 9700 | 0.0027 |
433
+ | 0.7992 | 9800 | 0.0034 |
434
+ | 0.8073 | 9900 | 0.0032 |
435
+ | 0.8155 | 10000 | 0.003 |
436
+ | 0.8236 | 10100 | 0.0029 |
437
+ | 0.8318 | 10200 | 0.0032 |
438
+ | 0.8399 | 10300 | 0.0032 |
439
+ | 0.8481 | 10400 | 0.003 |
440
+ | 0.8562 | 10500 | 0.0029 |
441
+ | 0.8644 | 10600 | 0.0029 |
442
+ | 0.8725 | 10700 | 0.0033 |
443
+ | 0.8807 | 10800 | 0.003 |
444
+ | 0.8889 | 10900 | 0.0029 |
445
+ | 0.8970 | 11000 | 0.0028 |
446
+ | 0.9052 | 11100 | 0.0035 |
447
+ | 0.9133 | 11200 | 0.003 |
448
+ | 0.9215 | 11300 | 0.0029 |
449
+ | 0.9296 | 11400 | 0.0029 |
450
+ | 0.9378 | 11500 | 0.0029 |
451
+ | 0.9459 | 11600 | 0.0034 |
452
+ | 0.9541 | 11700 | 0.0031 |
453
+ | 0.9622 | 11800 | 0.0028 |
454
+ | 0.9704 | 11900 | 0.003 |
455
+ | 0.9786 | 12000 | 0.0035 |
456
+ | 0.9867 | 12100 | 0.0032 |
457
+ | 0.9949 | 12200 | 0.003 |
458
+ | 1.0030 | 12300 | 0.0033 |
459
+ | 1.0112 | 12400 | 0.0029 |
460
+ | 1.0193 | 12500 | 0.003 |
461
+ | 1.0275 | 12600 | 0.0029 |
462
+ | 1.0356 | 12700 | 0.0036 |
463
+ | 1.0438 | 12800 | 0.003 |
464
+ | 1.0519 | 12900 | 0.0027 |
465
+ | 1.0601 | 13000 | 0.0028 |
466
+ | 1.0683 | 13100 | 0.0028 |
467
+ | 1.0764 | 13200 | 0.0036 |
468
+ | 1.0846 | 13300 | 0.0027 |
469
+ | 1.0927 | 13400 | 0.0028 |
470
+ | 1.1009 | 13500 | 0.0029 |
471
+ | 1.1090 | 13600 | 0.0037 |
472
+ | 1.1172 | 13700 | 0.0029 |
473
+ | 1.1253 | 13800 | 0.0029 |
474
+ | 1.1335 | 13900 | 0.0027 |
475
+ | 1.1416 | 14000 | 0.0033 |
476
+ | 1.1498 | 14100 | 0.003 |
477
+ | 1.1580 | 14200 | 0.0027 |
478
+ | 1.1661 | 14300 | 0.0028 |
479
+ | 1.1743 | 14400 | 0.0026 |
480
+ | 1.1824 | 14500 | 0.0036 |
481
+ | 1.1906 | 14600 | 0.0028 |
482
+ | 1.1987 | 14700 | 0.0027 |
483
+ | 1.2069 | 14800 | 0.0029 |
484
+ | 1.2150 | 14900 | 0.0035 |
485
+ | 1.2232 | 15000 | 0.0027 |
486
+ | 1.2313 | 15100 | 0.0027 |
487
+ | 1.2395 | 15200 | 0.0027 |
488
+ | 1.2477 | 15300 | 0.0028 |
489
+ | 1.2558 | 15400 | 0.0035 |
490
+ | 1.2640 | 15500 | 0.0027 |
491
+ | 1.2721 | 15600 | 0.0027 |
492
+ | 1.2803 | 15700 | 0.0027 |
493
+ | 1.2884 | 15800 | 0.0037 |
494
+ | 1.2966 | 15900 | 0.0027 |
495
+ | 1.3047 | 16000 | 0.0027 |
496
+ | 1.3129 | 16100 | 0.0027 |
497
+ | 1.3210 | 16200 | 0.0028 |
498
+ | 1.3292 | 16300 | 0.0033 |
499
+ | 1.3374 | 16400 | 0.0026 |
500
+ | 1.3455 | 16500 | 0.0025 |
501
+ | 1.3537 | 16600 | 0.0028 |
502
+ | 1.3618 | 16700 | 0.0034 |
503
+ | 1.3700 | 16800 | 0.0027 |
504
+ | 1.3781 | 16900 | 0.0026 |
505
+ | 1.3863 | 17000 | 0.0027 |
506
+ | 1.3944 | 17100 | 0.0033 |
507
+ | 1.4026 | 17200 | 0.0027 |
508
+ | 1.4107 | 17300 | 0.0027 |
509
+ | 1.4189 | 17400 | 0.0026 |
510
+ | 1.4271 | 17500 | 0.0027 |
511
+ | 1.4352 | 17600 | 0.0034 |
512
+ | 1.4434 | 17700 | 0.0027 |
513
+ | 1.4515 | 17800 | 0.0025 |
514
+ | 1.4597 | 17900 | 0.0027 |
515
+ | 1.4678 | 18000 | 0.0031 |
516
+ | 1.4760 | 18100 | 0.0027 |
517
+ | 1.4841 | 18200 | 0.0027 |
518
+ | 1.4923 | 18300 | 0.0027 |
519
+ | 1.5004 | 18400 | 0.0027 |
520
+ | 1.5086 | 18500 | 0.0031 |
521
+ | 1.5168 | 18600 | 0.0025 |
522
+ | 1.5249 | 18700 | 0.0026 |
523
+ | 1.5331 | 18800 | 0.0027 |
524
+ | 1.5412 | 18900 | 0.0035 |
525
+ | 1.5494 | 19000 | 0.0025 |
526
+ | 1.5575 | 19100 | 0.0027 |
527
+ | 1.5657 | 19200 | 0.0026 |
528
+ | 1.5738 | 19300 | 0.0028 |
529
+ | 1.5820 | 19400 | 0.0032 |
530
+ | 1.5901 | 19500 | 0.0025 |
531
+ | 1.5983 | 19600 | 0.0027 |
532
+ | 1.6065 | 19700 | 0.0026 |
533
+ | 1.6146 | 19800 | 0.0034 |
534
+ | 1.6228 | 19900 | 0.0027 |
535
+ | 1.6309 | 20000 | 0.0027 |
536
+ | 1.6391 | 20100 | 0.0028 |
537
+ | 1.6472 | 20200 | 0.0031 |
538
+ | 1.6554 | 20300 | 0.0028 |
539
+ | 1.6635 | 20400 | 0.0025 |
540
+ | 1.6717 | 20500 | 0.0025 |
541
+ | 1.6798 | 20600 | 0.0026 |
542
+ | 1.6880 | 20700 | 0.003 |
543
+ | 1.6962 | 20800 | 0.0029 |
544
+ | 1.7043 | 20900 | 0.0027 |
545
+ | 1.7125 | 21000 | 0.0025 |
546
+ | 1.7206 | 21100 | 0.0029 |
547
+ | 1.7288 | 21200 | 0.0029 |
548
+ | 1.7369 | 21300 | 0.0027 |
549
+ | 1.7451 | 21400 | 0.0026 |
550
+ | 1.7532 | 21500 | 0.0025 |
551
+ | 1.7614 | 21600 | 0.003 |
552
+ | 1.7696 | 21700 | 0.0028 |
553
+ | 1.7777 | 21800 | 0.0024 |
554
+ | 1.7859 | 21900 | 0.0025 |
555
+ | 1.7940 | 22000 | 0.003 |
556
+ | 1.8022 | 22100 | 0.0026 |
557
+ | 1.8103 | 22200 | 0.0027 |
558
+ | 1.8185 | 22300 | 0.0027 |
559
+ | 1.8266 | 22400 | 0.0026 |
560
+ | 1.8348 | 22500 | 0.003 |
561
+ | 1.8429 | 22600 | 0.0029 |
562
+ | 1.8511 | 22700 | 0.0025 |
563
+ | 1.8593 | 22800 | 0.0026 |
564
+ | 1.8674 | 22900 | 0.0031 |
565
+ | 1.8756 | 23000 | 0.0027 |
566
+ | 1.8837 | 23100 | 0.0026 |
567
+ | 1.8919 | 23200 | 0.0025 |
568
+ | 1.9000 | 23300 | 0.0028 |
569
+ | 1.9082 | 23400 | 0.0027 |
570
+ | 1.9163 | 23500 | 0.0027 |
571
+ | 1.9245 | 23600 | 0.0027 |
572
+ | 1.9326 | 23700 | 0.0026 |
573
+ | 1.9408 | 23800 | 0.0031 |
574
+ | 1.9490 | 23900 | 0.0027 |
575
+ | 1.9571 | 24000 | 0.0027 |
576
+ | 1.9653 | 24100 | 0.0026 |
577
+ | 1.9734 | 24200 | 0.0032 |
578
+ | 1.9816 | 24300 | 0.0029 |
579
+ | 1.9897 | 24400 | 0.0026 |
580
+ | 1.9979 | 24500 | 0.0028 |
581
+ | 2.0060 | 24600 | 0.0029 |
582
+ | 2.0142 | 24700 | 0.0026 |
583
+ | 2.0223 | 24800 | 0.0027 |
584
+ | 2.0305 | 24900 | 0.0033 |
585
+ | 2.0387 | 25000 | 0.0026 |
586
+ | 2.0468 | 25100 | 0.0026 |
587
+ | 2.0550 | 25200 | 0.0024 |
588
+ | 2.0631 | 25300 | 0.0026 |
589
+ | 2.0713 | 25400 | 0.0033 |
590
+ | 2.0794 | 25500 | 0.0025 |
591
+ | 2.0876 | 25600 | 0.0026 |
592
+ | 2.0957 | 25700 | 0.0026 |
593
+ | 2.1039 | 25800 | 0.0033 |
594
+ | 2.1120 | 25900 | 0.0025 |
595
+ | 2.1202 | 26000 | 0.0026 |
596
+ | 2.1284 | 26100 | 0.0026 |
597
+ | 2.1365 | 26200 | 0.0025 |
598
+ | 2.1447 | 26300 | 0.0031 |
599
+ | 2.1528 | 26400 | 0.0026 |
600
+ | 2.1610 | 26500 | 0.0025 |
601
+ | 2.1691 | 26600 | 0.0026 |
602
+ | 2.1773 | 26700 | 0.0032 |
603
+ | 2.1854 | 26800 | 0.0026 |
604
+ | 2.1936 | 26900 | 0.0026 |
605
+ | 2.2017 | 27000 | 0.0025 |
606
+ | 2.2099 | 27100 | 0.0032 |
607
+ | 2.2181 | 27200 | 0.0025 |
608
+ | 2.2262 | 27300 | 0.0025 |
609
+ | 2.2344 | 27400 | 0.0024 |
610
+ | 2.2425 | 27500 | 0.0025 |
611
+ | 2.2507 | 27600 | 0.0033 |
612
+ | 2.2588 | 27700 | 0.0024 |
613
+ | 2.2670 | 27800 | 0.0024 |
614
+ | 2.2751 | 27900 | 0.0024 |
615
+ | 2.2833 | 28000 | 0.0033 |
616
+ | 2.2914 | 28100 | 0.0025 |
617
+ | 2.2996 | 28200 | 0.0024 |
618
+ | 2.3078 | 28300 | 0.0026 |
619
+ | 2.3159 | 28400 | 0.0024 |
620
+ | 2.3241 | 28500 | 0.0032 |
621
+ | 2.3322 | 28600 | 0.0025 |
622
+ | 2.3404 | 28700 | 0.0024 |
623
+ | 2.3485 | 28800 | 0.0024 |
624
+ | 2.3567 | 28900 | 0.0032 |
625
+ | 2.3648 | 29000 | 0.0025 |
626
+ | 2.3730 | 29100 | 0.0024 |
627
+ | 2.3811 | 29200 | 0.0024 |
628
+ | 2.3893 | 29300 | 0.0028 |
629
+ | 2.3975 | 29400 | 0.003 |
630
+ | 2.4056 | 29500 | 0.0023 |
631
+ | 2.4138 | 29600 | 0.0025 |
632
+ | 2.4219 | 29700 | 0.0024 |
633
+ | 2.4301 | 29800 | 0.0032 |
634
+ | 2.4382 | 29900 | 0.0025 |
635
+ | 2.4464 | 30000 | 0.0024 |
636
+ | 2.4545 | 30100 | 0.0023 |
637
+ | 2.4627 | 30200 | 0.003 |
638
+ | 2.4708 | 30300 | 0.0024 |
639
+ | 2.4790 | 30400 | 0.0025 |
640
+ | 2.4872 | 30500 | 0.0025 |
641
+ | 2.4953 | 30600 | 0.0025 |
642
+ | 2.5035 | 30700 | 0.0031 |
643
+ | 2.5116 | 30800 | 0.0022 |
644
+ | 2.5198 | 30900 | 0.0024 |
645
+ | 2.5279 | 31000 | 0.0024 |
646
+ | 2.5361 | 31100 | 0.0032 |
647
+ | 2.5442 | 31200 | 0.0024 |
648
+ | 2.5524 | 31300 | 0.0023 |
649
+ | 2.5605 | 31400 | 0.0025 |
650
+ | 2.5687 | 31500 | 0.0024 |
651
+ | 2.5769 | 31600 | 0.0031 |
652
+ | 2.5850 | 31700 | 0.0024 |
653
+ | 2.5932 | 31800 | 0.0024 |
654
+ | 2.6013 | 31900 | 0.0024 |
655
+ | 2.6095 | 32000 | 0.0031 |
656
+ | 2.6176 | 32100 | 0.0025 |
657
+ | 2.6258 | 32200 | 0.0025 |
658
+ | 2.6339 | 32300 | 0.0025 |
659
+ | 2.6421 | 32400 | 0.0027 |
660
+ | 2.6502 | 32500 | 0.0029 |
661
+ | 2.6584 | 32600 | 0.0024 |
662
+ | 2.6666 | 32700 | 0.0023 |
663
+ | 2.6747 | 32800 | 0.0025 |
664
+ | 2.6829 | 32900 | 0.0028 |
665
+ | 2.6910 | 33000 | 0.0026 |
666
+ | 2.6992 | 33100 | 0.0025 |
667
+ | 2.7073 | 33200 | 0.0024 |
668
+ | 2.7155 | 33300 | 0.0025 |
669
+ | 2.7236 | 33400 | 0.0026 |
670
+ | 2.7318 | 33500 | 0.0027 |
671
+ | 2.7399 | 33600 | 0.0025 |
672
+ | 2.7481 | 33700 | 0.0024 |
673
+ | 2.7563 | 33800 | 0.0028 |
674
+ | 2.7644 | 33900 | 0.0025 |
675
+ | 2.7726 | 34000 | 0.0024 |
676
+ | 2.7807 | 34100 | 0.0023 |
677
+ | 2.7889 | 34200 | 0.0027 |
678
+ | 2.7970 | 34300 | 0.0024 |
679
+ | 2.8052 | 34400 | 0.0025 |
680
+ | 2.8133 | 34500 | 0.0024 |
681
+ | 2.8215 | 34600 | 0.0024 |
682
+ | 2.8297 | 34700 | 0.0029 |
683
+ | 2.8378 | 34800 | 0.0027 |
684
+ | 2.8460 | 34900 | 0.0025 |
685
+ | 2.8541 | 35000 | 0.0023 |
686
+ | 2.8623 | 35100 | 0.0029 |
687
+ | 2.8704 | 35200 | 0.0025 |
688
+ | 2.8786 | 35300 | 0.0024 |
689
+ | 2.8867 | 35400 | 0.0024 |
690
+ | 2.8949 | 35500 | 0.0024 |
691
+ | 2.9030 | 35600 | 0.0028 |
692
+ | 2.9112 | 35700 | 0.0026 |
693
+ | 2.9194 | 35800 | 0.0023 |
694
+ | 2.9275 | 35900 | 0.0024 |
695
+ | 2.9357 | 36000 | 0.003 |
696
+ | 2.9438 | 36100 | 0.0025 |
697
+ | 2.9520 | 36200 | 0.0025 |
698
+ | 2.9601 | 36300 | 0.0024 |
699
+ | 2.9683 | 36400 | 0.0028 |
700
+ | 2.9764 | 36500 | 0.0027 |
701
+ | 2.9846 | 36600 | 0.0027 |
702
+ | 2.9927 | 36700 | 0.0025 |
703
+
704
+ </details>
705
+
+ ### Framework Versions
+ - Python: 3.11.2
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.49.0
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.0.1
+ - Datasets: 3.1.0
+ - Tokenizers: 0.21.0
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### ContrastiveLoss
+ ```bibtex
+ @inproceedings{hadsell2006dimensionality,
+   author={Hadsell, R. and Chopra, S. and LeCun, Y.},
+   booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
+   title={Dimensionality Reduction by Learning an Invariant Mapping},
+   year={2006},
+   volume={2},
+   number={},
+   pages={1735-1742},
+   doi={10.1109/CVPR.2006.100}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-large-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.49.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.49.0",
+     "pytorch": "2.5.1+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c5fbbe1b27a976024c628b566d028d31ebf72c3e24ab1ae775880361c195076c
+ size 1340612432
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff