| # Cross-Encoder for Natural Language Inference | |
| This model was trained using the [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class. | |
| ## Training Data | |
| The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral. | |
| ## Usage | |
| Pre-trained models can be used like this: | |
| ```python | |
| from sentence_transformers import CrossEncoder | |
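| # Load the cross-encoder; replace 'model_name' with the name of the model to use | |
| model = CrossEncoder('model_name') | |
| # Score each sentence pair; the result holds one row of three scores per pair | |
| scores = model.predict([('A man is eating pizza', 'A man eats something'), ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')]) | |
| # Convert scores to labels by taking the highest-scoring label for each pair | |
| label_mapping = ['contradiction', 'entailment', 'neutral'] | |
| labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)] | |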
| ``` | |
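If probabilities are easier to work with than raw scores, a softmax over each row is one option. The sketch below assumes the three scores are unnormalized logits in the label order given above; the values in `example_scores` are made up for illustration and are not real model output, and `softmax` is a hypothetical helper, not part of either library.

```python
import math

# Hypothetical raw scores for two sentence pairs. Real values would come
# from model.predict(...), for example via scores.tolist().
example_scores = [
    [-2.0, 4.0, 0.5],
    [3.0, -1.0, 1.0],
]
label_mapping = ['contradiction', 'entailment', 'neutral']


def softmax(row):
    """Turn one row of raw scores into probabilities that sum to 1."""
    shift = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


for row in example_scores:
    probs = softmax(row)
    # Index of the most probable label for this pair
    best = max(range(len(probs)), key=probs.__getitem__)
    print(label_mapping[best], round(probs[best], 3))
```

Softmax preserves the ordering of the scores, so the predicted label is the same as with `argmax` on the raw scores; only the scale changes.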
| ## Usage with Transformers AutoModel | |
| You can also use the model directly with the Transformers library (without the SentenceTransformers library): | |
| ```python | |
| from transformers import AutoTokenizer, AutoModelForSequenceClassification | |
| import torch | |
| model = AutoModelForSequenceClassification.from_pretrained('model_name') | |
| tokenizer = AutoTokenizer.from_pretrained('model_name') | |
| features = tokenizer(['A man is eating pizza', 'A black race car starts up in front of a crowd of people.'], ['A man eats something', 'A man is driving down a lonely road.'], padding=True, truncation=True, return_tensors="pt") | |
| model.eval() | |
| with torch.no_grad(): | |
|     scores = model(**features).logits | |
| label_mapping = ['contradiction', 'entailment', 'neutral'] | |
| labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)] | |
| print(labels) | |
| ``` |
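The two usage styles take their input in different shapes: `CrossEncoder.predict` receives a list of sentence-pair tuples, while the tokenizer call receives two parallel lists. A minimal sketch of converting between them follows; `split_pairs` and the names `premises` and `hypotheses` are hypothetical and chosen here for illustration only.

```python
def split_pairs(pairs):
    """Split (first, second) sentence tuples into two parallel lists."""
    premises = [first for first, _ in pairs]
    hypotheses = [second for _, second in pairs]
    return premises, hypotheses


pairs = [
    ('A man is eating pizza', 'A man eats something'),
    ('A black race car starts up in front of a crowd of people.',
     'A man is driving down a lonely road.'),
]

premises, hypotheses = split_pairs(pairs)
print(premises)
print(hypotheses)
```

The resulting lists can then be passed as the first two arguments of the tokenizer call, in place of the literal lists used in the example.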