Update README.md
README.md CHANGED
@@ -15,7 +15,7 @@ metrics:
 - comet
 pipeline_tag: translation
 ---
-# Model Card for
+# Model Card for TowerInstruct-Mistral-7B-v0.2
 
 ## Model Details
 
@@ -24,7 +24,7 @@ pipeline_tag: translation
 TowerInstruct-Mistral-7B-v0.2 is a language model that results from fine-tuning a Mistral version of TowerBase on the TowerBlocks supervised fine-tuning dataset.
 The model is trained to handle several translation-related tasks, such as general machine translation (e.g., sentence- and paragraph/document-level translation, terminology-aware translation, context-aware translation), automatic post-editing, named-entity recognition, grammatical error correction, and paraphrase generation.
 
-This model has performance comparable to [TowerInstruct-13B-v0.2](https://huggingface.co/Unbabel/TowerInstruct-13B-v0.1), while being half the size. Check out our [paper](https://
+This model has performance comparable to [TowerInstruct-13B-v0.2](https://huggingface.co/Unbabel/TowerInstruct-13B-v0.1), while being half the size. Check out our [paper in COLM 2024](https://openreview.net/pdf?id=EHPns3hVkj).
 
 - **Developed by:** Unbabel, Instituto Superior Técnico, CentraleSupélec University of Paris-Saclay
 - **Model type:** A 7B parameter model fine-tuned on a mix of publicly available, synthetic datasets on translation-related tasks, as well as conversational datasets and code instructions.
@@ -58,7 +58,7 @@ Here's how you can run the model using the `pipeline()` function from 🤗 Trans
 import torch
 from transformers import pipeline
 
-pipe = pipeline("text-generation", model="Unbabel/
+pipe = pipeline("text-generation", model="Unbabel/TowerInstruct-Mistral-7B-v0.2", torch_dtype=torch.bfloat16, device_map="auto")
 # We use the tokenizer’s chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
 messages = [
     {"role": "user", "content": "Translate the following text from Portuguese into English.\nPortuguese: Um grupo de investigadores lançou um novo modelo para tarefas relacionadas com tradução.\nEnglish:"},
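The hunk above ends at the diff context window, before the snippet's generation step. For readers following along, a minimal sketch of how the example plausibly continues, using the standard `apply_chat_template` flow from the chat-templating docs linked in the comment; the decoding settings (`max_new_tokens=256`, `do_sample=False`) are illustrative assumptions, not part of this diff:

```python
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="Unbabel/TowerInstruct-Mistral-7B-v0.2", torch_dtype=torch.bfloat16, device_map="auto")
messages = [
    {"role": "user", "content": "Translate the following text from Portuguese into English.\nPortuguese: Um grupo de investigadores lançou um novo modelo para tarefas relacionadas com tradução.\nEnglish:"},
]
# Render the conversation with the tokenizer's chat template, leaving the
# assistant turn open so the model generates the translation.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=False)  # assumed decoding settings
print(outputs[0]["generated_text"])
```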
@@ -81,11 +81,11 @@ We are currently working on improving quality and consistency on document-level
 
 ## Bias, Risks, and Limitations
 
-
+TowerInstruct-Mistral-7B-v0.2 has not been aligned to human preferences, so the model may generate problematic outputs (e.g., hallucinations, harmful content, or false statements).
 
 ## Prompt Format
 
-
+TowerInstruct-Mistral-7B-v0.2 was trained using the ChatML prompt templates without any system prompts. An example follows below:
 ```
 <|im_start|>user
 {USER PROMPT}<|im_end|>
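One quick way to confirm the ChatML format described above is to render a message through the repo's bundled chat template; a minimal sketch, assuming the tokenizer ships the same template used by the pipeline example:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Unbabel/TowerInstruct-Mistral-7B-v0.2")
# add_generation_prompt=True appends the opening assistant turn; no system
# message is inserted, matching the "no system prompts" note above.
rendered = tok.apply_chat_template(
    [{"role": "user", "content": "{USER PROMPT}"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(rendered)
# Expected shape:
# <|im_start|>user
# {USER PROMPT}<|im_end|>
# <|im_start|>assistant
```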
@@ -108,13 +108,13 @@ Link to [TowerBlocks](https://huggingface.co/datasets/Unbabel/TowerBlocks-v0.1).
 ## Citation
 
 ```bibtex
-@
-
-
-
-
-
-
+@inproceedings{
+alves2024tower,
+title={Tower: An Open Multilingual Large Language Model for Translation-Related Tasks},
+author={Duarte Miguel Alves and Jos{\'e} Pombal and Nuno M Guerreiro and Pedro Henrique Martins and Jo{\~a}o Alves and Amin Farajian and Ben Peters and Ricardo Rei and Patrick Fernandes and Sweta Agrawal and Pierre Colombo and Jos{\'e} G. C. de Souza and Andre Martins},
+booktitle={First Conference on Language Modeling},
+year={2024},
+url={https://openreview.net/forum?id=EHPns3hVkj}
 }
 ```
 