metadata
language:
- en
license: apache-2.0
tags:
- biencoder
- sentence-transformers
- text-classification
- sentence-pair-classification
- semantic-similarity
- semantic-search
- retrieval
- reranking
- generated_from_trainer
- dataset_size:1451941
- loss:MultipleNegativesRankingLoss
base_model: Alibaba-NLP/gte-modernbert-base
widget:
- source_sentence: >-
Gocharya ji authored Krishna Cahrit Manas in the poetic form describing
about the full life of Lord Krishna ( from birth to Nirvana ) .
sentences:
- 'Q: Can I buy coverage for prescription drugs right away?'
- >-
Krishna Cahrit Manas in poetic form , describing the full life of Lord
Krishna ( from birth to nirvana ) , wrote Gocharya ji .
- >-
Baron played actress Violet Carson who portrayed Ena Sharples in the
soap .
- source_sentence: The Kilkenny line only reached Maryborough in 1867 .
sentences:
- It was also known formerly as ' Crotto ' .
- The line from Maryborough only reached Kilkenny in 1867 .
- The line from Kilkenny only reached Maryborough in 1867 .
- source_sentence: >-
Tokelau International Netball Team represents Tokelau in the national
netball .
sentences:
- >-
Ernest Dewey Albinson ( 1898 in Minneapolis , Minnesota - 1971 in Mexico
) was an American artist .
- >-
The Tokelau national netball team represents Tokelau in international
netball .
- >-
The Tokelau international netball team represents Tokelau in national
netball .
- source_sentence: >-
The real number is called the `` imaginary part `` of the real number ;
the real number is called the `` complex part `` of .
sentences:
- >-
The school board consists of Robbie Sanders , Bryan Richards , Linda
Fullingim , Lori Lambert , & Kelly Teague .
- Which web design company has the best templates?
- >-
The real number is called the `` imaginary part `` of the real number ,
the real number of `` complex part `` of .
- source_sentence: >-
All For You was the third and last single of Kate Ryan 's third album ``
Alive `` .
sentences:
- >-
According to John Keay , he was `` country bred `` ( born and educated
in India ) .
- >-
All For You was the third single of the third and last album `` Alive ``
by Kate Ryan .
- >-
All For You was the third and last single of the third album of Kate
Ryan `` Alive `` .
datasets:
- redis/langcache-sentencepairs-v1
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_precision@1
- cosine_recall@1
- cosine_ndcg@10
- cosine_mrr@1
- cosine_map@100
model-index:
- name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: train
type: train
metrics:
- type: cosine_accuracy@1
value: 0.5578696687594717
name: Cosine Accuracy@1
- type: cosine_precision@1
value: 0.5578696687594717
name: Cosine Precision@1
- type: cosine_recall@1
value: 0.53589188426978
name: Cosine Recall@1
- type: cosine_ndcg@10
value: 0.7523955452910316
name: Cosine Ndcg@10
- type: cosine_mrr@1
value: 0.5578696687594717
name: Cosine Mrr@1
- type: cosine_map@100
value: 0.6976030263836698
name: Cosine Map@100
Redis fine-tuned BiEncoder model for semantic caching on LangCache
This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-modernbert-base on the LangCache Sentence Pairs (all) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for sentence-pair similarity.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-modernbert-base
- Maximum Sequence Length: 100 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: LangCache Sentence Pairs (all)
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
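The Pooling module above has `pooling_mode_cls_token` set to `True`, so the sentence embedding is the hidden state of the first token rather than an average over all tokens. The following is a minimal sketch in plain Python, using toy 3-dimensional vectors in place of the 768-dimensional outputs. The function names `cls_pool` and `mean_pool` are hypothetical and are not part of Sentence Transformers.

```python
from typing import List, Sequence

Vector = List[float]


def cls_pool(token_embeddings: Sequence[Vector]) -> Vector:
    """Return the embedding of the first ([CLS]) token as the sentence vector."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings: Sequence[Vector], attention_mask: Sequence[int]) -> Vector:
    """Average the embeddings of non-padding tokens (shown only for contrast)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    dim = len(token_embeddings[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


# Toy sequence: [CLS], two word tokens, and one padding token (mask 0).
tokens = [[1.0, 0.0, 2.0], [3.0, 3.0, 0.0], [5.0, 1.0, 4.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]

cls_vector = cls_pool(tokens)          # what this model's pooling layer selects
mean_vector = mean_pool(tokens, mask)  # what mean pooling would produce instead
print(cls_vector)
print(mean_vector)
```

With CLS pooling, padding and the other tokens influence the result only through the Transformer layers, not through the pooling step itself.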
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
    "All For You was the third and last single of Kate Ryan 's third album `` Alive `` .",
    'All For You was the third and last single of the third album of Kate Ryan `` Alive `` .',
    'All For You was the third single of the third and last album `` Alive `` by Kate Ryan .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[0.9961, 0.9922, 0.9961],
# [0.9922, 1.0000, 0.9922],
# [0.9961, 0.9922, 1.0078]], dtype=torch.bfloat16)
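For semantic caching, similarity scores like the ones above are usually compared against a threshold to decide whether a new query can reuse a cached response. The sketch below assumes the embeddings have already been computed (for example with `model.encode`) and uses toy 2-dimensional vectors so that it runs without downloading the model. The helper `best_cache_hit` and the threshold of 0.9 are hypothetical illustrations, not values recommended by this model card.

```python
import math
from typing import Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_cache_hit(
    query_embedding: Sequence[float],
    cached_embeddings: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed value; tune it on your own data
) -> Optional[Tuple[int, float]]:
    """Return (index, score) of the most similar cached entry, or None on a miss."""
    best_index, best_score = -1, -1.0
    for index, cached in enumerate(cached_embeddings):
        score = cosine_similarity(query_embedding, cached)
        if score > best_score:
            best_index, best_score = index, score
    if best_index >= 0 and best_score >= threshold:
        return best_index, best_score
    return None


# Toy embeddings standing in for encoded cached prompts.
cached = [[1.0, 0.0], [0.0, 1.0]]

hit = best_cache_hit([0.9, 0.1], cached)   # close to the first cached entry
miss = best_cache_hit([1.0, 1.0], cached)  # equally far from both entries
print(hit)
print(miss)
```

With the real model, the query embedding would come from `model.encode(query)` and the cached embeddings from the rows of `model.encode(cached_prompts)`. A linear scan is shown for clarity; a vector index would typically replace it at scale.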
Evaluation
Metrics
Information Retrieval
- Dataset: train
- Evaluated with InformationRetrievalEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.5579 |
| cosine_precision@1 | 0.5579 |
| cosine_recall@1 | 0.5359 |
| cosine_ndcg@10 | 0.7524 |
| cosine_mrr@1 | 0.5579 |
| cosine_map@100 | 0.6976 |
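The metrics in the table are averages of per-query scores over a ranked list of candidates. The sketch below shows the common definitions of accuracy, precision, recall, and MRR at a cutoff `k`. It is an illustration under the assumption that the evaluator follows these standard definitions; `metrics_at_k` is a hypothetical helper and the document IDs are toy data.

```python
from typing import Dict, List, Set


def metrics_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and MRR at cutoff k for one query."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]

    # Reciprocal rank of the first relevant result inside the top k (0 if none).
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1.0 / rank
            break

    return {
        "accuracy": 1.0 if hits else 0.0,         # any relevant result in the top k
        "precision": len(hits) / k,               # fraction of the top k that is relevant
        "recall": len(hits) / len(relevant_ids),  # fraction of relevant items retrieved
        "mrr": reciprocal_rank,
    }


# Two toy queries: (ranked candidate IDs, set of relevant IDs).
queries = [
    (["d1", "d2", "d3"], {"d1", "d4"}),  # top result is relevant, one relevant item missed
    (["d5", "d6", "d7"], {"d6"}),        # relevant item is ranked second
]

per_query = [metrics_at_k(ranked, relevant, k=1) for ranked, relevant in queries]
averaged = {
    name: sum(scores[name] for scores in per_query) / len(per_query)
    for name in per_query[0]
}
print(averaged)
```

At `k=1`, accuracy, precision, and MRR coincide by construction, while recall is lower whenever a query has more than one relevant item. That would be consistent with the table, where the three `@1` scores are all 0.5579 and recall@1 is 0.5359.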
Training Details
Training Dataset
LangCache Sentence Pairs (all)
- Dataset: LangCache Sentence Pairs (all)
- Size: 109,885 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 8 tokens, mean: 27.27 tokens, max: 49 tokens | min: 8 tokens, mean: 27.27 tokens, max: 48 tokens | min: 7 tokens, mean: 26.47 tokens, max: 61 tokens |
- Samples:
| anchor | positive | negative |
|---|---|---|
| The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | how can I get financial freedom as soon as possible? |
| The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The older Punts are still very much in existence today and race in the same fleets as the newer boats . |
| Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |
- Loss: MultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
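The loss above treats each anchor's own positive as the correct class among all positives and negatives in the batch, and applies cross-entropy to cosine similarities multiplied by `scale`. The following is a minimal plain-Python sketch of that idea with toy 3-dimensional vectors. It is an illustration under stated assumptions, not the library's implementation, and the function name is a hypothetical stand-in.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cos_sim(a: Vector, b: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def multiple_negatives_ranking_loss(
    anchors: List[Vector],
    positives: List[Vector],
    negatives: List[Vector],
    scale: float = 20.0,  # matches the "scale" parameter listed above
) -> float:
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    # Candidates: every positive in the batch, followed by every hard negative.
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in candidates]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum_exp - logits[i]  # negative log-softmax at the true positive
    return total / len(anchors)


# Toy batch of two (anchor, positive, negative) triplets.
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
positives = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
negatives = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

aligned_loss = multiple_negatives_ranking_loss(anchors, positives, negatives)
# Swapping the positives pairs each anchor with the wrong target.
shuffled_loss = multiple_negatives_ranking_loss(anchors, positives[::-1], negatives)
print(aligned_loss, shuffled_loss)
```

The loss is close to zero when each anchor is most similar to its own positive and grows when another candidate scores higher. A larger `scale` sharpens the softmax, so ranking mistakes are penalized more strongly.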
Evaluation Dataset
LangCache Sentence Pairs (all)
- Dataset: LangCache Sentence Pairs (all)
- Size: 109,885 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 8 tokens, mean: 27.27 tokens, max: 49 tokens | min: 8 tokens, mean: 27.27 tokens, max: 48 tokens | min: 7 tokens, mean: 26.47 tokens, max: 61 tokens |
- Samples:
| anchor | positive | negative |
|---|---|---|
| The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | how can I get financial freedom as soon as possible? |
| The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The older Punts are still very much in existence today and race in the same fleets as the newer boats . |
| Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |
- Loss: MultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
Training Logs
| Epoch | Step | train_cosine_ndcg@10 |
|---|---|---|
| -1 | -1 | 0.7524 |
Framework Versions
- Python: 3.12.3
- Sentence Transformers: 5.1.0
- Transformers: 4.56.0
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}