---
language:
- it
license: apache-2.0
tags:
- italian
- sequence-to-sequence
- style-transfer
- formality-style-transfer
datasets:
- yahoo/xformal_it
widget:
- text: "maronn qualcuno mi spieg' CHECCOSA SUCCEDE?!?!"
- text: "wellaaaaaaa, ma fraté sei proprio troppo simpatiko, grazieeee!!"
- text: "nn capisco xke tt i ragazzi lo fanno"
- text: "IT5 è SUPERMEGA BRAVISSIMO a capire tt il vernacolo italiano!!!"
metrics:
- rouge
- bertscore
model-index:
- name: mt5-base-informal-to-formal
  results:
  - task:
      type: formality-style-transfer
      name: "Informal-to-formal Style Transfer"
    dataset:
      type: xformal_it
      name: "XFORMAL (Italian Subset)"
    metrics:
    - type: rouge1
      value: 0.661
      name: "Avg. Test Rouge1"
    - type: rouge2
      value: 0.471
      name: "Avg. Test Rouge2"
    - type: rougeL
      value: 0.642
      name: "Avg. Test RougeL"
    - type: bertscore
      value: 0.712
      name: "Avg. Test BERTScore"
      args:
        model_type: "dbmdz/bert-base-italian-xxl-uncased"
        lang: "it"
        num_layers: 10
        rescale_with_baseline: True
        baseline_path: "bertscore_baseline_ita.tsv"
co2_eq_emissions:
  emissions: "40g"
  source: "Google Cloud Platform Carbon Footprint"
  training_type: "fine-tuning"
  geographical_location: "Eemshaven, Netherlands, Europe"
  hardware_used: "1 TPU v3-8 VM"
---

# mT5 Base for Informal-to-formal Style Transfer 🧐

This repository contains the checkpoint for the [mT5 Base](https://huggingface.co/google/mt5-base) model, fine-tuned for informal-to-formal style transfer on the Italian subset of the XFORMAL dataset as part of the experiments of the paper [IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation](https://arxiv.org/abs/2203.03759) by [Gabriele Sarti](https://gsarti.com) and [Malvina Nissim](https://malvinanissim.github.io).

A comprehensive overview of other released materials is provided in the [gsarti/it5](https://github.com/gsarti/it5) repository. Refer to the paper for additional details concerning the reported scores and the evaluation approach.

## Using the model

Model checkpoints are available for usage in TensorFlow, PyTorch, and JAX. They can be used directly with pipelines as:

```python
from transformers import pipeline

i2f = pipeline("text2text-generation", model="it5/mt5-base-informal-to-formal")
i2f("nn capisco xke tt i ragazzi lo fanno")
>>> [{"generated_text": "non comprendo perché tutti i ragazzi agiscono così"}]
```

or loaded using autoclasses:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("it5/mt5-base-informal-to-formal")
model = AutoModelForSeq2SeqLM.from_pretrained("it5/mt5-base-informal-to-formal")
```

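When using the autoclasses, generation has to be invoked explicitly. As a sketch of what this looks like end to end, the snippet below tokenizes one of the widget examples and decodes the model's output; the generation settings (`max_length`, `num_beams`) are illustrative assumptions, not the configuration used in the paper:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("it5/mt5-base-informal-to-formal")
model = AutoModelForSeq2SeqLM.from_pretrained("it5/mt5-base-informal-to-formal")

# Tokenize an informal sentence and generate its formal rewrite
# (max_length and num_beams chosen for illustration only).
inputs = tokenizer("nn capisco xke tt i ragazzi lo fanno", return_tensors="pt")
outputs = model.generate(**inputs, max_length=128, num_beams=4)
formal = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(formal)
```
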
If you use this model in your research, please cite our work as:

```bibtex
@article{sarti-nissim-2022-it5,
    title={{IT5}: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
    author={Sarti, Gabriele and Nissim, Malvina},
    journal={ArXiv preprint 2203.03759},
    url={https://arxiv.org/abs/2203.03759},
    year={2022},
    month={mar}
}
```