# Encoder-Decoder model with DeBERTa decoder
## Pre-trained models
Encoder: `microsoft/deberta-v3-small`
Decoder: `deliciouscat/deberta-v3-base-decoder-v0.1`; 6 transformer layers, 8 attention heads
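The assembly script is not included in this card; below is a minimal sketch of how the two checkpoints can be combined with the `transformers` `EncoderDecoderModel` API, assuming the decoder checkpoint is loadable as a causal-LM decoder with cross-attention (this is an illustration, not the author's exact setup):

```python
from transformers import AutoTokenizer, EncoderDecoderModel

# Compose a seq2seq model from the two checkpoints listed above.
# Assumes the decoder repo can be loaded as a causal-LM decoder
# (cross-attention weights are added and randomly initialized).
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/deberta-v3-small",                 # encoder
    "deliciouscat/deberta-v3-base-decoder-v0.1",  # decoder: 6 layers, 8 heads
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-small")

# Special-token ids that seq2seq generation needs
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id
```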
## Data used
`HuggingFaceFW/fineweb`, from which 124,800 documents were sampled
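How the 124,800 documents were selected is not specified; one straightforward way to draw a sample of that size is to stream the dataset with `datasets`, for example:

```python
from datasets import load_dataset

# Stream FineWeb (too large to download in full) and take the first
# 124,800 documents; the actual sampling strategy is an assumption here.
stream = load_dataset("HuggingFaceFW/fineweb", split="train", streaming=True)
sample = list(stream.take(124_800))
print(len(sample))  # 124800
```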
## Training hyperparameters
optimizer: AdamW, lr=2.3e-5, betas=(0.875, 0.997)
batch size: 12 (the largest that fit on a Colab Pro A100)
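These settings map directly onto `torch.optim.AdamW`. A minimal sketch, where `model` is the encoder-decoder from above and `train_dataset` is a placeholder name for the tokenized FineWeb sample:

```python
import torch
from torch.utils.data import DataLoader

# AdamW with the listed hyperparameters
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2.3e-5,
    betas=(0.875, 0.997),
)

# Batch size 12 was the largest that fit in A100 memory.
# train_dataset: tokenized FineWeb sample (placeholder name)
train_loader = DataLoader(train_dataset, batch_size=12, shuffle=True)
```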
## How to use
```python
from transformers import AutoTokenizer, EncoderDecoderModel

# Replace the placeholder below with this repository's Hub id
model_id = "<this-model-repo-id>"
model = EncoderDecoderModel.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```
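A short generation example with the loaded checkpoint (the input text is illustrative, and `decoder_start_token_id` must be set in the model config):

```python
inputs = tokenizer(
    "DeBERTa improves on BERT and RoBERTa using disentangled attention.",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```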
## Future work!
- Train on more scientific data
- Fine-tune on a keyword extraction task