---
language:
- de
tags:
- cross-encoder
widget:
- text: Was sind Lamas. Das Lama (Lama glama) ist eine Art der Kamele. Es ist in den
    südamerikanischen Anden verbreitet und eine vom Guanako abstammende Haustierform.
  example_title: Example Query / Paragraph
license: apache-2.0
metrics:
- Rouge-Score
library_name: sentence-transformers
pipeline_tag: text-ranking
---
# cross-encoder-mmarco-german-distilbert-base
## Model description
This model is a fine-tuned [cross-encoder](https://www.sbert.net/examples/training/cross-encoder/README.html) trained on the [MMARCO dataset](https://huggingface.co/datasets/unicamp-dl/mmarco), a machine-translated version of the MS MARCO dataset. As the base model for fine-tuning we use [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased).
Model input samples are tuples of one of the following formats:
`<query, positive_paragraph>` with label 1, or `<query, negative_paragraph>` with label 0.
The model was trained for 1 epoch.
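The tuple format above can be illustrated with plain Python. The query and paragraphs below are made-up examples for illustration, not actual MMARCO records:

```python
# Hypothetical training tuples in the <query, paragraph> -> label format.
# Label 1 marks a relevant (positive) paragraph, 0 an irrelevant (negative) one.
positive = ("Was sind Lamas?",
            "Das Lama (Lama glama) ist eine Art der Kamele.", 1)
negative = ("Was sind Lamas?",
            "Der Eiffelturm steht in Paris.", 0)
train_samples = [positive, negative]

for query, paragraph, label in train_samples:
    print(f"label={label}: {query} -> {paragraph}")
```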
## Model usage
The cross-encoder model can be used like this:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')  # replace 'model_name' with this model's Hugging Face path
scores = model.predict([('Query 1', 'Paragraph 1'), ('Query 2', 'Paragraph 2')])
```
The model will predict a relevance score for each of the pairs `('Query 1', 'Paragraph 1')` and `('Query 2', 'Paragraph 2')`.
For more details on using cross-encoder models, have a look at the [Sentence-Transformers documentation](https://www.sbert.net/).
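A typical use of these scores is re-ranking a set of candidate paragraphs for a query. The sketch below uses placeholder scores; in practice they would come from `model.predict([(query, p) for p in paragraphs])`:

```python
# Sketch: use cross-encoder scores to re-rank candidate paragraphs.
# The query, paragraphs, and scores are hypothetical placeholders.
query = "Was sind Lamas?"
paragraphs = [
    "Der Eiffelturm steht in Paris.",
    "Das Lama (Lama glama) ist eine Art der Kamele.",
]
scores = [0.02, 0.97]  # placeholder outputs of model.predict

# Sort paragraphs by descending score: most relevant first.
ranked = sorted(zip(paragraphs, scores), key=lambda pair: pair[1], reverse=True)
best_paragraph, best_score = ranked[0]
print(best_paragraph)
```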
## Model Performance
Model evaluation was done on 2000 evaluation paragraphs of the dataset.

| Accuracy (%) | F1-Score (%) | Precision (%) | Recall (%) |
| --- | --- | --- | --- |
| 89.70 | 86.82 | 86.82 | 93.50 |