metadata
language:
  - en
license: apache-2.0
tags:
  - biencoder
  - sentence-transformers
  - text-classification
  - sentence-pair-classification
  - semantic-similarity
  - semantic-search
  - retrieval
  - reranking
  - generated_from_trainer
  - dataset_size:483820
  - loss:MultipleNegativesSymmetricRankingLoss
base_model: Alibaba-NLP/gte-modernbert-base
widget:
  - source_sentence: >-
      See Precambrian time scale # Proposed Geologic timeline for another set of
      periods 4600 -- 541 MYA .
    sentences:
      - >-
        In 2014 election , Biju Janata Dal candidate Tathagat Satapathy
        Bharatiya Janata party candidate Rudra Narayan Pany defeated with a
        margin of 1.37,340 votes .
      - >-
        In Scotland , the Strathclyde Partnership for Transport , formerly known
        as Strathclyde Passenger Transport Executive , comprises the former
        Strathclyde region , which includes the urban area around Glasgow .
      - >-
        See Precambrian Time Scale # Proposed Geological Timeline for another
        set of periods of 4600 -- 541 MYA .
  - source_sentence: >-
      It is also 5 kilometers northeast of Tamaqua , 27 miles south of Allentown
      and 9 miles northwest of Hazleton .
    sentences:
      - In 1948 he moved to Massachusetts , and eventually settled in Vermont .
      - >-
        Suddenly I remembered that I was a New Zealander , I caught the first
        plane home and came back .
      - >-
        It is also 5 miles northeast of Tamaqua , 27 miles south of Allentown ,
        and 9 miles northwest of Hazleton .
  - source_sentence: >-
      The party has a Member of Parliament , a member of the House of Lords ,
      three members of the London Assembly and two Members of the European
      Parliament .
    sentences:
      - >-
        The party has one Member of Parliament , one member of the House of
        Lords , three Members of the London Assembly and two Members of the
        European Parliament .
      - >-
        Grapsid crabs dominate in Australia , Malaysia and Panama , while
        gastropods Cerithidea scalariformis and Melampus coeffeus are important
        seed predators in Florida mangroves .
      - >-
        Music Story is a music service website and international music data
        provider that curates , aggregates and analyses metadata for digital
        music services .
  - source_sentence: >-
      The play received two 1969 Tony Award nominations : Best Actress in a Play
      ( Michael Annals ) and Best Costume Design ( Charlotte Rae ) .
    sentences:
      - >-
        Ravishanker is a fellow of the International Statistical Institute and
        an elected member of the American Statistical Association .
      - >-
        In 1969 , the play received two Tony - Award nominations : Best Actress
        in a Theatre Play ( Michael Annals ) and Best Costume Design ( Charlotte
        Rae ) .
      - >-
        AMD and Nvidia both have proprietary methods of scaling , CrossFireX for
        AMD , and SLI for Nvidia .
  - source_sentence: >-
      He was a close friend of Ángel Cabrera and is a cousin of golfer Tony
      Croatto .
    sentences:
      - >-
        He was a close friend of Ángel Cabrera , and is a cousin of golfer Tony
        Croatto .
      - >-
        Eugenijus Bartulis ( born December 7 , 1949 in Kaunas ) is a Lithuanian
        Roman Catholic priest , and Bishop of Šiauliai .
      - >-
        UWIRE also distributes its members content to professional media outlets
        , including Yahoo , CNN and CBS News .
datasets:
  - redis/langcache-sentencepairs-v1
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_precision@1
  - cosine_recall@1
  - cosine_ndcg@10
  - cosine_mrr@1
  - cosine_map@100
model-index:
  - name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: train
          type: train
        metrics:
          - type: cosine_accuracy@1
            value: 0.5978783286425633
            name: Cosine Accuracy@1
          - type: cosine_precision@1
            value: 0.5978783286425633
            name: Cosine Precision@1
          - type: cosine_recall@1
            value: 0.5765917883925028
            name: Cosine Recall@1
          - type: cosine_ndcg@10
            value: 0.7905393533594786
            name: Cosine Ndcg@10
          - type: cosine_mrr@1
            value: 0.5978783286425633
            name: Cosine Mrr@1
          - type: cosine_map@100
            value: 0.7375956597574003
            name: Cosine Map@100

Redis fine-tuned BiEncoder model for semantic caching on LangCache

This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-modernbert-base on the LangCache Sentence Pairs (all) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used to score the similarity of sentence pairs.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
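
The Pooling module above sets `pooling_mode_cls_token` to True, so the sentence embedding is the hidden state of the first token rather than an average over all tokens. The following is a minimal pure-Python sketch of that difference, using made-up 3-dimensional token vectors in place of the model's 768-dimensional ones. The helper names `cls_pool` and `mean_pool` are illustrative and are not library functions.

```python
def cls_pool(token_embeddings):
    """CLS pooling: use the first token's vector as the sentence embedding."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings):
    """Mean pooling, shown only for contrast: element-wise average over tokens."""
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[d] for tok in token_embeddings) / n for d in range(dim)]


# Toy hidden states for one 3-token sentence (values are invented)
tokens = [
    [1.0, 0.0, 2.0],  # first token, the one CLS pooling keeps
    [3.0, 4.0, 0.0],
    [2.0, 2.0, 4.0],
]

cls_vec = cls_pool(tokens)
mean_vec = mean_pool(tokens)
print(cls_vec)
print(mean_vec)
```

With CLS pooling the remaining token vectors influence the result only through the transformer's attention layers, not through the pooling step itself.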

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
    'He was a close friend of Ángel Cabrera and is a cousin of golfer Tony Croatto .',
    'He was a close friend of Ángel Cabrera , and is a cousin of golfer Tony Croatto .',
    'UWIRE also distributes its members content to professional media outlets , including Yahoo , CNN and CBS News .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9922, 0.0547],
#         [0.9922, 1.0000, 0.0449],
#         [0.0547, 0.0449, 1.0000]], dtype=torch.bfloat16)
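
Because the model targets semantic caching, one plausible way to use these scores is to treat a new query as a cache hit when its cosine similarity to a stored query reaches a threshold. The sketch below is an illustration under stated assumptions: the toy 3-dimensional vectors stand in for `model.encode` output, and both the `cache_lookup` helper and the 0.9 threshold are hypothetical, not values taken from this model card.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cache_lookup(query_vec, cache, threshold=0.9):
    """Return the response of the most similar cached entry, or None on a miss.

    `threshold` is an assumed value; it should be tuned on real traffic.
    """
    best_score, best_response = -1.0, None
    for vec, response in cache:
        score = cosine(query_vec, vec)
        if score > best_score:
            best_score, best_response = score, response
    return best_response if best_score >= threshold else None


# Toy cache: (embedding of a stored query, stored response)
cache = [
    ([1.0, 0.0, 0.0], "cached answer A"),
    ([0.0, 1.0, 0.0], "cached answer B"),
]

hit = cache_lookup([0.9, 0.1, 0.0], cache)   # close to entry A
miss = cache_lookup([0.5, 0.5, 0.7], cache)  # not close to any entry
print(hit, miss)
```

A linear scan is enough for a sketch; a production cache would more likely delegate the nearest-neighbour search to a vector index.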

Evaluation

Metrics

Information Retrieval

| Metric             | Value  |
|:-------------------|:-------|
| cosine_accuracy@1  | 0.5979 |
| cosine_precision@1 | 0.5979 |
| cosine_recall@1    | 0.5766 |
| cosine_ndcg@10     | 0.7905 |
| cosine_mrr@1       | 0.5979 |
| cosine_map@100     | 0.7376 |
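
Accuracy@1, precision@1, and MRR@1 are identical here (0.5979) because, under the standard definitions, all three reduce at k=1 to whether the top-ranked result is relevant. Recall@1 divides by the number of relevant items per query, so it can be lower when some queries have more than one relevant item. The sketch below illustrates this with invented toy data; `metrics_at_1` is a hypothetical helper, not the evaluator used to produce the numbers above.

```python
def metrics_at_1(ranked_ids, relevant_ids):
    """Compute k=1 retrieval metrics for a single query.

    ranked_ids: retrieved document ids, best first.
    relevant_ids: set of ids that are relevant for this query.
    """
    top1_is_relevant = 1.0 if ranked_ids[0] in relevant_ids else 0.0
    return {
        "accuracy@1": top1_is_relevant,                     # any relevant hit in the top 1
        "precision@1": top1_is_relevant / 1,                # relevant hits in top 1, divided by k=1
        "mrr@1": top1_is_relevant,                          # reciprocal rank 1/1 if hit, else 0
        "recall@1": top1_is_relevant / len(relevant_ids),   # divided by all relevant items
    }


# Invented toy queries: (ranking, relevant set)
queries = [
    (["d1", "d2", "d3"], {"d1"}),        # hit, one relevant item
    (["d4", "d5", "d6"], {"d4", "d6"}),  # hit, two relevant items
    (["d7", "d8", "d9"], {"d9"}),        # miss at rank 1
]

per_query = [metrics_at_1(ranking, relevant) for ranking, relevant in queries]
averaged = {
    name: sum(m[name] for m in per_query) / len(per_query)
    for name in per_query[0]
}
print(averaged)
```

In this toy set the first three metrics average to 2/3 while recall@1 averages to 0.5, mirroring the pattern in the table.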

Training Details

Training Dataset

LangCache Sentence Pairs (all)

  • Dataset: LangCache Sentence Pairs (all)
  • Size: 26,850 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 | label |
    |:--------|:----------|:----------|:------|
    | type    | string    | string    | int   |
    | details | min: 8 tokens, mean: 27.35 tokens, max: 53 tokens | min: 8 tokens, mean: 27.27 tokens, max: 52 tokens | 1: 100.00% |
  • Samples:
    | sentence1 | sentence2 | label |
    |:----------|:----------|:------|
    | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | 1 |
    | After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall . | Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall . | 1 |
    | The 12F was officially homologated on August 21 , 1929 and exhibited at the Paris Salon in 1930 . | The 12F was officially homologated on 21 August 1929 and displayed at the 1930 Paris Salon . | 1 |
  • Loss: MultipleNegativesSymmetricRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
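
To make these parameters concrete, here is a simplified pure-Python sketch of a symmetric in-batch ranking loss with cosine similarity and a scale of 20.0. It is an illustrative re-implementation under the assumption that the loss averages a cross-entropy over anchor-to-positive and positive-to-anchor directions; it is not the library's code, and the 2-dimensional vectors are invented.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def cross_entropy(logits, target):
    """Negative log-softmax of `logits` at index `target` (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def symmetric_ranking_loss(anchors, positives, scale=20.0):
    """Average of anchor->positive and positive->anchor in-batch cross-entropy."""
    n = len(anchors)
    # scores[i][j] = scaled similarity between anchor i and positive j
    scores = [[scale * cos_sim(a, p) for p in positives] for a in anchors]
    # Forward: each anchor must pick its own positive among all positives
    forward = sum(cross_entropy(scores[i], i) for i in range(n)) / n
    # Backward: each positive must pick its own anchor among all anchors
    backward = sum(
        cross_entropy([scores[j][i] for j in range(n)], i) for i in range(n)
    ) / n
    return (forward + backward) / 2


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_ranking_loss(anchors, [[1.0, 0.1], [0.1, 1.0]])  # correct pairing
swapped = symmetric_ranking_loss(anchors, [[0.1, 1.0], [1.0, 0.1]])  # pairs crossed
print(aligned, swapped)
```

The scale multiplies cosine scores before the softmax, so with a value of 20.0 even moderate similarity gaps produce sharply peaked probabilities.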
    

Evaluation Dataset

LangCache Sentence Pairs (all)

  • Dataset: LangCache Sentence Pairs (all)
  • Size: 26,850 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 | label |
    |:--------|:----------|:----------|:------|
    | type    | string    | string    | int   |
    | details | min: 8 tokens, mean: 27.35 tokens, max: 53 tokens | min: 8 tokens, mean: 27.27 tokens, max: 52 tokens | 1: 100.00% |
  • Samples:
    | sentence1 | sentence2 | label |
    |:----------|:----------|:------|
    | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | 1 |
    | After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall . | Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall . | 1 |
    | The 12F was officially homologated on August 21 , 1929 and exhibited at the Paris Salon in 1930 . | The 12F was officially homologated on 21 August 1929 and displayed at the 1930 Paris Salon . | 1 |
  • Loss: MultipleNegativesSymmetricRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • learning_rate: 0.0003
  • adam_beta2: 0.98
  • adam_epsilon: 1e-06
  • max_steps: 200000
  • warmup_steps: 1000
  • load_best_model_at_end: True
  • optim: adamw_torch
  • ddp_find_unused_parameters: False
  • push_to_hub: True
  • hub_model_id: redis/langcache-embed-v3
  • batch_sampler: no_duplicates
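
The `learning_rate`, `warmup_steps`, and `max_steps` values above combine with the linear scheduler (`lr_scheduler_type: linear` in the full list) into a simple schedule: a linear ramp to the peak rate, then a linear decay to zero. The function below is a minimal sketch of that shape, written from the usual definition of a linear warmup schedule; `linear_schedule_lr` is an illustrative name, not a library function.

```python
def linear_schedule_lr(step, base_lr=3e-4, warmup_steps=1000, max_steps=200_000):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup steps
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr down to 0 at max_steps
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


for step in (0, 500, 1000, 100_500, 200_000):
    print(step, linear_schedule_lr(step))
```

Under these settings the peak rate of 0.0003 is reached at step 1000 and is halved again by step 100,500.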

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0003
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.98
  • adam_epsilon: 1e-06
  • max_grad_norm: 1.0
  • num_train_epochs: 3.0
  • max_steps: 200000
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 1000
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: False
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: redis/langcache-embed-v3
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss train_cosine_ndcg@10
-1 -1 - - 0.7522
0.5291 1000 0.0231 0.1710 0.7518
1.0582 2000 0.0147 0.1552 0.7593
1.5873 3000 0.0126 0.1616 0.7603
2.1164 4000 0.0113 0.1301 0.7644
2.6455 5000 0.0119 0.1276 0.7659
3.1746 6000 0.0099 0.1270 0.7648
3.7037 7000 0.0101 0.1239 0.7676
4.2328 8000 0.0093 0.1267 0.7709
4.7619 9000 0.0092 0.1190 0.7711
5.2910 10000 0.0088 0.1145 0.7735
5.8201 11000 0.009 0.1172 0.7735
6.3492 12000 0.0083 0.1144 0.7749
6.8783 13000 0.0088 0.1140 0.7736
7.4074 14000 0.0083 0.1134 0.7751
7.9365 15000 0.0087 0.1108 0.7742
8.4656 16000 0.0084 0.1119 0.7759
8.9947 17000 0.0081 0.1125 0.7762
9.5238 18000 0.0081 0.1134 0.7768
10.0529 19000 0.008 0.1126 0.7766
10.5820 20000 0.0079 0.1119 0.7755
11.1111 21000 0.0078 0.1112 0.7781
11.6402 22000 0.008 0.1113 0.7778
12.1693 23000 0.0082 0.1066 0.7796
12.6984 24000 0.0078 0.1098 0.7775
13.2275 25000 0.0078 0.1089 0.7800
13.7566 26000 0.0074 0.1091 0.7779
14.2857 27000 0.0078 0.1061 0.7782
14.8148 28000 0.0074 0.1073 0.7769
15.3439 29000 0.0078 0.1022 0.7804
15.8730 30000 0.0078 0.1035 0.7799
16.4021 31000 0.0074 0.1046 0.7793
16.9312 32000 0.0074 0.1043 0.7817
17.4603 33000 0.0071 0.1056 0.7831
17.9894 34000 0.0074 0.1022 0.7820
18.5185 35000 0.0073 0.1035 0.7820
19.0476 36000 0.0074 0.1020 0.7836
19.5767 37000 0.0071 0.1036 0.7828
20.1058 38000 0.007 0.1029 0.7845
20.6349 39000 0.0071 0.1019 0.7835
21.1640 40000 0.007 0.0991 0.7849
21.6931 41000 0.0071 0.1013 0.7828
22.2222 42000 0.0073 0.1033 0.7833
22.7513 43000 0.0068 0.0996 0.7835
23.2804 44000 0.007 0.0976 0.7850
23.8095 45000 0.0069 0.0986 0.7840
24.3386 46000 0.0068 0.0992 0.7856
24.8677 47000 0.0068 0.0988 0.7838
25.3968 48000 0.0068 0.0980 0.7857
25.9259 49000 0.007 0.0976 0.7860
26.4550 50000 0.0071 0.0994 0.7850
26.9841 51000 0.0067 0.0984 0.7862
27.5132 52000 0.0064 0.0992 0.7845
28.0423 53000 0.0068 0.1021 0.7840
28.5714 54000 0.0066 0.0974 0.7863
29.1005 55000 0.0066 0.1001 0.7848
29.6296 56000 0.0067 0.0997 0.7848
30.1587 57000 0.0067 0.0965 0.7868
30.6878 58000 0.0067 0.0968 0.7858
31.2169 59000 0.0066 0.0973 0.7861
31.7460 60000 0.0067 0.0972 0.7865
32.2751 61000 0.0065 0.0991 0.7855
32.8042 62000 0.0062 0.0960 0.7871
33.3333 63000 0.0068 0.1006 0.7863
33.8624 64000 0.0063 0.0980 0.7872
34.3915 65000 0.0066 0.0957 0.7871
34.9206 66000 0.0066 0.0971 0.7870
35.4497 67000 0.0063 0.0982 0.7857
35.9788 68000 0.0067 0.0944 0.7871
36.5079 69000 0.0062 0.0961 0.7870
37.0370 70000 0.0061 0.0924 0.7880
37.5661 71000 0.0064 0.0928 0.7878
38.0952 72000 0.0065 0.0934 0.7888
38.6243 73000 0.0069 0.0948 0.7873
39.1534 74000 0.0064 0.0922 0.7885
39.6825 75000 0.0064 0.0937 0.7888
40.2116 76000 0.0059 0.0941 0.7882
40.7407 77000 0.0067 0.0934 0.7900
41.2698 78000 0.0064 0.0926 0.7888
41.7989 79000 0.006 0.0948 0.7880
42.3280 80000 0.006 0.0953 0.7876
42.8571 81000 0.0058 0.0955 0.7887
43.3862 82000 0.0065 0.0945 0.7875
43.9153 83000 0.0063 0.0928 0.7888
44.4444 84000 0.0065 0.0959 0.7883
44.9735 85000 0.0063 0.0956 0.7876
45.5026 86000 0.006 0.0946 0.7893
46.0317 87000 0.0062 0.0954 0.7908
46.5608 88000 0.0061 0.0960 0.7896
47.0899 89000 0.006 0.0953 0.7893
47.6190 90000 0.0058 0.0941 0.7899
48.1481 91000 0.0059 0.0950 0.7892
48.6772 92000 0.0066 0.0948 0.7890
49.2063 93000 0.0058 0.0947 0.7886
49.7354 94000 0.006 0.0952 0.7891
50.2646 95000 0.0058 0.0948 0.7885
50.7937 96000 0.0058 0.0945 0.7894
51.3228 97000 0.0059 0.0936 0.7901
51.8519 98000 0.0059 0.0950 0.7900
52.3810 99000 0.0058 0.0954 0.7893
52.9101 100000 0.0058 0.0946 0.7900
53.4392 101000 0.0056 0.0943 0.7900
53.9683 102000 0.006 0.0950 0.7895
54.4974 103000 0.0059 0.0937 0.7899
55.0265 104000 0.0061 0.0941 0.7897
55.5556 105000 0.0059 0.0941 0.7903
56.0847 106000 0.0057 0.0924 0.7904
56.6138 107000 0.006 0.0933 0.7901
57.1429 108000 0.0059 0.0948 0.7888
57.6720 109000 0.0061 0.0938 0.7899
58.2011 110000 0.0058 0.0942 0.7904
58.7302 111000 0.0056 0.0943 0.7913
59.2593 112000 0.0056 0.0949 0.7915
59.7884 113000 0.0058 0.0947 0.7907
60.3175 114000 0.0058 0.0939 0.7910
60.8466 115000 0.0058 0.0942 0.7906
61.3757 116000 0.0055 0.0933 0.7910
61.9048 117000 0.0055 0.0936 0.7913
62.4339 118000 0.0059 0.0937 0.7904
62.9630 119000 0.0057 0.0943 0.7908
63.4921 120000 0.0056 0.0934 0.7912
64.0212 121000 0.0058 0.0936 0.7909
64.5503 122000 0.0055 0.0942 0.7896
65.0794 123000 0.0058 0.0939 0.7901
65.6085 124000 0.0057 0.0936 0.7907
66.1376 125000 0.0054 0.0951 0.7901
66.6667 126000 0.0055 0.0942 0.7912
67.1958 127000 0.0057 0.0943 0.7914
67.7249 128000 0.0057 0.0937 0.7910
68.2540 129000 0.0057 0.0933 0.7918
68.7831 130000 0.0055 0.0935 0.7913
69.3122 131000 0.0053 0.0935 0.7908
69.8413 132000 0.0057 0.0937 0.7905
70.3704 133000 0.0055 0.0940 0.7912
70.8995 134000 0.0052 0.0937 0.7913
71.4286 135000 0.005 0.0940 0.7917
71.9577 136000 0.0053 0.0933 0.7914
72.4868 137000 0.0056 0.0940 0.7915
73.0159 138000 0.0054 0.0937 0.7909
73.5450 139000 0.0051 0.0940 0.7909
74.0741 140000 0.0058 0.0938 0.7911
74.6032 141000 0.0056 0.0938 0.7912
75.1323 142000 0.0052 0.0931 0.7908
75.6614 143000 0.0052 0.0937 0.7905
76.1905 144000 0.0054 0.0940 0.7905
76.7196 145000 0.0055 0.0940 0.7907
77.2487 146000 0.0053 0.0941 0.7909
77.7778 147000 0.0057 0.0944 0.7907
78.3069 148000 0.0054 0.0947 0.7909
78.8360 149000 0.0054 0.0949 0.7907
79.3651 150000 0.0055 0.0948 0.7907
79.8942 151000 0.0058 0.0950 0.7907
80.4233 152000 0.0054 0.0946 0.7907
80.9524 153000 0.0053 0.0949 0.7909
81.4815 154000 0.0055 0.0947 0.7908
82.0106 155000 0.0053 0.0946 0.7906
82.5397 156000 0.0053 0.0949 0.7906
83.0688 157000 0.0051 0.0948 0.7912
83.5979 158000 0.0052 0.0954 0.7906
84.1270 159000 0.0054 0.0953 0.7908
84.6561 160000 0.005 0.0951 0.7911
85.1852 161000 0.0054 0.0953 0.7910
85.7143 162000 0.0056 0.0957 0.7907
86.2434 163000 0.0054 0.0953 0.7909
86.7725 164000 0.0051 0.0955 0.7912
87.3016 165000 0.0055 0.0956 0.7911
87.8307 166000 0.0056 0.0954 0.7909
88.3598 167000 0.0052 0.0955 0.7911
88.8889 168000 0.0052 0.0953 0.7910
89.4180 169000 0.0052 0.0952 0.7906
89.9471 170000 0.0053 0.0952 0.7908
90.4762 171000 0.0052 0.0954 0.7908
91.0053 172000 0.0054 0.0954 0.7907
91.5344 173000 0.0052 0.0951 0.7909
92.0635 174000 0.0053 0.0951 0.7907
92.5926 175000 0.0051 0.0950 0.7906
93.1217 176000 0.0054 0.0953 0.7907
93.6508 177000 0.0052 0.0953 0.7907
94.1799 178000 0.0051 0.0951 0.7908
94.7090 179000 0.0052 0.0952 0.7906
95.2381 180000 0.0053 0.0953 0.7909
95.7672 181000 0.0052 0.0953 0.7908
96.2963 182000 0.0051 0.0952 0.7906
96.8254 183000 0.0053 0.0953 0.7907
97.3545 184000 0.0051 0.0953 0.7907
97.8836 185000 0.0049 0.0953 0.7906
98.4127 186000 0.0051 0.0953 0.7907
98.9418 187000 0.0051 0.0954 0.7906
99.4709 188000 0.0053 0.0954 0.7906
100.0 189000 0.0051 0.0954 0.7904
100.5291 190000 0.0054 0.0953 0.7907
101.0582 191000 0.0052 0.0954 0.7905
101.5873 192000 0.0051 0.0954 0.7907
102.1164 193000 0.0052 0.0953 0.7907
102.6455 194000 0.0051 0.0955 0.7908
103.1746 195000 0.0054 0.0954 0.7906
103.7037 196000 0.0052 0.0954 0.7905
104.2328 197000 0.0053 0.0954 0.7906
104.7619 198000 0.0052 0.0954 0.7907
105.2910 199000 0.0055 0.0954 0.7904
105.8201 200000 0.0054 0.0955 0.7905
  • The bold row denotes the saved checkpoint.
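
The Epoch column can be cross-checked against the hyperparameters with a little arithmetic. The sketch below assumes a single training device and no gradient accumulation (`gradient_accumulation_steps: 1`), and uses only values that appear in this card: the first logged row (step 1000 at epoch 0.5291), `per_device_train_batch_size`, and `max_steps`.

```python
import math

batch_size = 256       # per_device_train_batch_size
max_steps = 200_000    # max_steps
steps_logged, epoch_logged = 1000, 0.5291  # first logged row after the baseline

# Steps needed for one pass over the training data
steps_per_epoch = round(steps_logged / epoch_logged)

# Upper bound on the number of samples seen per epoch (last batch may be partial)
implied_samples_upper = steps_per_epoch * batch_size

# Number of epochs covered by max_steps
total_epochs = max_steps / steps_per_epoch

print(steps_per_epoch)         # 1890
print(implied_samples_upper)   # 483840
print(round(total_epochs, 4))  # 105.8201

# The dataset_size tag in the metadata (483820) needs the same number of batches
print(math.ceil(483_820 / batch_size))  # 1890
```

The result matches the final log row (epoch 105.8201 at step 200000) and is consistent with the `dataset_size:483820` tag. It also shows that `max_steps`, rather than `num_train_epochs: 3.0`, determined the length of the run.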

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}