metadata
language:
- en
license: apache-2.0
tags:
- biencoder
- sentence-transformers
- text-classification
- sentence-pair-classification
- semantic-similarity
- semantic-search
- retrieval
- reranking
- generated_from_trainer
- dataset_size:3587
- loss:CustomBCELoss
base_model: Alibaba-NLP/gte-modernbert-base
widget:
- source_sentence: Hunter College was originally Lehman College 's uptown campus .
sentences:
- >-
Acquired programming includes the Irish soap `` Fair City `` and Finnish
drama `` Black Widows `` .
- >-
According to the United States Census Bureau , the town has a total area
of ; of the area is land and 0.66 % is water .
- Hunter College originally was Lehman College Uptown Campus .
- source_sentence: He hoped to defeat them and then marry Ravonna .
sentences:
- >-
Stillwater Creek received its official name in 1884 when William L.
Couch established his `` boomer colony `` on its banks .
- Note that the invertible of a matrix is always an exponential matrix .
- He hoped to defeat them and marry Ravonna .
- source_sentence: >-
Born on February 2 , 1984 , Abrar Khan is a professional Pakistani
international Kabaddi player .
sentences:
- >-
Born on February 2 , 1984 , Abrar Khan is a professional Pakistani
international Kabaddi player .
- >-
Together , the paired mylohyoid muscles form a muscular floor for the
oral cavity of the mouth .
- >-
Abrar Khan born 2 February 1984 is a Pakistani professional
international Kabaddi player .
- source_sentence: >-
Certainly , `` Lucy was nothing like flat `` in physical form , social
condition , and personality .
sentences:
- >-
The real number is called the `` imaginary part `` of the real number ;
the real number is called the `` complex part `` of .
- >-
From the Celebes lake , the captain Bullock observed the appearance of
the corona , while Gustav Fritsch accompanied an expedition to Aden .
- >-
Certainly `` Lucy was , in physical form , social condition and
personality , nothing like Shallow `` .
- source_sentence: >-
The trio has performed besides Gesaffelstein , Justice , Bob Moses and Lee
Foss .
sentences:
- >-
The trio has performed besides Gesaffelstein , Justice , Bob Moses and
Lee Foss .
- >-
The suttas generally contain educational content , while other early
Buddhist texts deal with monastic discipline or vinaya .
- >-
The trio has performed alongside Bob Moses , Justice , Gesaffelstein and
Lee Foss .
datasets:
- redis/langcache-sentencepairs-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_precision@1
- cosine_recall@1
- cosine_ndcg@10
- cosine_mrr@1
- cosine_map@100
model-index:
- name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: test
type: test
metrics:
- type: cosine_accuracy@1
value: 0.5861241448475948
name: Cosine Accuracy@1
- type: cosine_precision@1
value: 0.5861241448475948
name: Cosine Precision@1
- type: cosine_recall@1
value: 0.5679885764966713
name: Cosine Recall@1
- type: cosine_ndcg@10
value: 0.773078207125666
name: Cosine Ndcg@10
- type: cosine_mrr@1
value: 0.5861241448475948
name: Cosine Mrr@1
- type: cosine_map@100
value: 0.7217228927629071
name: Cosine Map@100
Redis fine-tuned BiEncoder model for semantic caching on LangCache
This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-modernbert-base on the LangCache Sentence Pairs (all) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used to score the similarity of sentence pairs.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-modernbert-base
- Maximum Sequence Length: 100 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: LangCache Sentence Pairs (all)
- Language: en
- License: apache-2.0
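As a minimal sketch of the similarity function listed above, cosine similarity compares the direction of two embedding vectors and ignores their length. The toy 3-dimensional vectors below are illustrative stand-ins for the model's 768-dimensional outputs, and `cosine_similarity` is a hypothetical helper, not part of the library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors standing in for real embeddings.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]   # same direction as a, twice the length
c = [-3.0, 0.0, 1.0]  # orthogonal to a

print(cosine_similarity(a, b))  # close to 1: same direction
print(cosine_similarity(a, c))  # close to 0: orthogonal
```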
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
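The Pooling configuration above enables only `pooling_mode_cls_token`, meaning the sentence embedding is the vector at the first token position rather than an average over tokens. The sketch below illustrates the difference on toy data; `cls_pooling` and `mean_pooling` are hypothetical helpers written for this example, and the 2-dimensional vectors stand in for 768-dimensional token embeddings.

```python
def cls_pooling(token_embeddings):
    """Return the first token's vector as the sentence embedding."""
    return list(token_embeddings[0])

def mean_pooling(token_embeddings, attention_mask):
    """Average the vectors of non-padding tokens (shown for contrast only)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

# Toy sequence: 4 token positions, 2 dimensions; the last position is padding.
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

print(cls_pooling(tokens))         # [1.0, 0.0]
print(mean_pooling(tokens, mask))  # [3.0, 2.0]
```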
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
'The trio has performed besides Gesaffelstein , Justice , Bob Moses and Lee Foss .',
'The trio has performed besides Gesaffelstein , Justice , Bob Moses and Lee Foss .',
'The trio has performed alongside Bob Moses , Justice , Gesaffelstein and Lee Foss .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[0.9961, 0.9961, 0.9844],
# [0.9961, 0.9961, 0.9844],
# [0.9844, 0.9844, 0.9961]], dtype=torch.bfloat16)
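Since the model is described as targeting semantic caching, here is a rough sketch of how similarity scores might drive a cache lookup. Everything in it is an assumption made for illustration: the `SemanticCache` class, the `0.9` threshold, and the `toy_embed` stand-in, which replaces the real model so the example runs without a download.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

class SemanticCache:
    """Toy in-memory cache keyed by embedding similarity (hypothetical helper)."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # callable: str -> sequence of floats
        self.threshold = threshold  # assumed cutoff, tune on your own data
        self.entries = []           # list of (embedding, cached_response)

    def put(self, prompt, response):
        self.entries.append((self.embed(prompt), response))

    def get(self, prompt):
        """Return the cached response of the closest stored prompt, or None."""
        if not self.entries:
            return None
        query = self.embed(prompt)
        score, response = max(
            ((cosine(query, emb), resp) for emb, resp in self.entries),
            key=lambda pair: pair[0],
        )
        return response if score >= self.threshold else None

# Stand-in embedder (word counts over a tiny vocabulary), not the real model.
def toy_embed(text):
    words = text.lower().split()
    vocab = ["trio", "performed", "justice", "suttas", "vinaya"]
    return [float(words.count(w)) + 0.01 for w in vocab]

cache = SemanticCache(toy_embed, threshold=0.9)
cache.put("The trio has performed besides Justice", "cached answer")
print(cache.get("The trio has performed alongside Justice"))  # hit
print(cache.get("The suttas deal with vinaya"))               # miss
```

With the real model, `embed` could plausibly be `model.encode`, which returns one vector per input string; a suitable threshold would have to be chosen empirically.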
Evaluation
Metrics
Information Retrieval
- Dataset: test
- Evaluated with InformationRetrievalEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.5861 |
| cosine_precision@1 | 0.5861 |
| cosine_recall@1 | 0.5680 |
| cosine_ndcg@10 | 0.7731 |
| cosine_mrr@1 | 0.5861 |
| cosine_map@100 | 0.7217 |
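To make the relationship between these metrics concrete, the sketch below computes per-query accuracy, precision, recall and MRR at a cutoff k using their usual information-retrieval definitions. At k=1 the first, second and fourth values coincide, which matches the identical 0.5861 figures in the table; recall@1 can be lower whenever a query has more than one relevant document. The function `metrics_at_k` and the document ids are hypothetical.

```python
def metrics_at_k(ranked_ids, relevant_ids, k):
    """Accuracy, precision, recall and MRR at cutoff k for one query."""
    top_k = ranked_ids[:k]
    hits = [doc_id in relevant_ids for doc_id in top_k]
    accuracy = 1.0 if any(hits) else 0.0
    precision = sum(hits) / k
    recall = sum(hits) / len(relevant_ids)
    # Reciprocal rank of the first relevant document within the top k, else 0.
    mrr = next((1.0 / rank for rank, hit in enumerate(hits, start=1) if hit), 0.0)
    return accuracy, precision, recall, mrr

# Toy query with two relevant documents; the top-ranked one is relevant.
print(metrics_at_k(["d3", "d7", "d1"], {"d3", "d1"}, k=1))  # (1.0, 1.0, 0.5, 1.0)
# Toy query whose single relevant document is ranked second.
print(metrics_at_k(["d5", "d2"], {"d2"}, k=1))              # (0.0, 0.0, 0.0, 0.0)
```

The reported values are these per-query numbers averaged over all test queries.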
Training Details
Training Dataset
LangCache Sentence Pairs (all)
- Dataset: LangCache Sentence Pairs (all)
- Size: 1,922 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 8 tokens; mean: 27.26 tokens; max: 49 tokens | min: 8 tokens; mean: 27.24 tokens; max: 49 tokens | min: 9 tokens; mean: 27.09 tokens; max: 49 tokens |

- Samples:

| anchor | positive | negative |
|---|---|---|
| The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | At that time , on June 22 , 1754 , Edward Bentham married Bentham Elizabeth Bates ( d . 1790 ) from Hampshire in the nearby county of Alton . |
| The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | In 2012 , Cornell 5th and Lehigh 8th , Cornell was also 4th in 2013 and 7th in 2014 . |
| Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |

- Loss: losses.CustomBCELoss
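The card names the loss `losses.CustomBCELoss` but does not define it; going by the name, it is a custom binary cross-entropy variant. Purely as a generic illustration of a binary cross-entropy objective over sentence pairs, and not the authors' implementation, the sketch below turns a similarity score into a probability and penalises disagreement with a 0/1 label. The function `pair_bce`, the `scale` parameter, and the labelling (1 for anchor-positive, 0 for anchor-negative) are all assumptions.

```python
import math

def pair_bce(similarity, label, scale=10.0):
    """Binary cross-entropy on one pair; `scale` is a hypothetical temperature."""
    prob = 1.0 / (1.0 + math.exp(-scale * similarity))  # squash score into (0, 1)
    eps = 1e-12  # guard against log(0)
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1.0 - prob + eps))

# A high-similarity pair is cheap when labelled positive, costly when labelled negative.
print(pair_bce(0.9, label=1))
print(pair_bce(0.9, label=0))
print(pair_bce(0.0, label=0))
```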
Evaluation Dataset
LangCache Sentence Pairs (all)
- Dataset: LangCache Sentence Pairs (all)
- Size: 1,922 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 8 tokens; mean: 27.26 tokens; max: 49 tokens | min: 8 tokens; mean: 27.24 tokens; max: 49 tokens | min: 9 tokens; mean: 27.09 tokens; max: 49 tokens |

- Samples:

| anchor | positive | negative |
|---|---|---|
| The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | At that time , on June 22 , 1754 , Edward Bentham married Bentham Elizabeth Bates ( d . 1790 ) from Hampshire in the nearby county of Alton . |
| The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | In 2012 , Cornell 5th and Lehigh 8th , Cornell was also 4th in 2013 and 7th in 2014 . |
| Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |

- Loss: losses.CustomBCELoss
Training Logs
| Epoch | Step | test_cosine_ndcg@10 |
|---|---|---|
| -1 | -1 | 0.7731 |
Framework Versions
- Python: 3.12.3
- Sentence Transformers: 5.1.0
- Transformers: 4.56.0
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}