---
datasets:
- EleutherAI/pile
language:
- en
pipeline_tag: text2text-generation
tags:
- summarization
- translation
---
|
|
|
|
|
# Model Card for T5v2 Base |
|
|
|
|
|
# Table of Contents |
|
|
|
|
|
1. [Model Details](#model-details)
2. [Uses](#uses)
3. [Bias, Risks, and Limitations](#bias-risks-and-limitations)
4. [Training Details](#training-details)
5. [Evaluation](#evaluation)
6. [Environmental Impact](#environmental-impact)
7. [Citation](#citation)
8. [Model Card Authors](#model-card-authors)
9. [How To Get Started With the Model](#how-to-get-started-with-the-model)
|
|
|
|
|
# Model Details |
|
|
|
|
|
## Model Description |
|
|
|
|
|
More information needed. |
|
|
# Uses |
|
|
|
|
|
## Direct Use and Downstream Use |
|
|
|
|
|
More information needed. |
|
|
|
|
|
## Out-of-Scope Use |
|
|
|
|
|
More information needed. |
|
|
|
|
|
# Bias, Risks, and Limitations |
|
|
|
|
|
More information needed. |
|
|
|
|
|
## Recommendations |
|
|
|
|
|
More information needed. |
|
|
|
|
|
# Training Details |
|
|
|
|
|
## Training Data |
|
|
|
|
|
The model was pre-trained on [the Pile](https://huggingface.co/datasets/EleutherAI/pile) using an unsupervised denoising objective.
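
As a rough illustration, T5's denoising objective replaces randomly sampled spans of the input with sentinel tokens and trains the model to reconstruct the dropped spans. The sketch below follows the original T5 span-corruption recipe; whether T5v2 uses the same corruption rate and span lengths is an assumption.

```python
# Hypothetical sketch of T5-style span corruption (example from the T5 paper);
# T5v2's exact corruption hyperparameters are not documented here.
text = "Thank you for inviting me to your party last week ."

# Sampled spans are replaced with sentinel tokens in the encoder input...
corrupted_input = "Thank you <extra_id_0> me to your party <extra_id_1> week ."

# ...and the decoder target emits each dropped span after its sentinel,
# terminated by a final sentinel.
target = "<extra_id_0> for inviting <extra_id_1> last <extra_id_2>"
```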
|
|
## Training Procedure |
|
|
|
|
|
More information needed. |
|
|
|
|
|
# Evaluation |
|
|
|
|
|
## Testing Data, Factors & Metrics |
|
|
|
|
|
More information needed. |
|
|
## Results |
|
|
|
|
|
More information needed. |
|
|
|
|
|
# Environmental Impact |
|
|
|
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
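
The underlying estimate is simply power draw × runtime × regional grid carbon intensity. A minimal sketch of that arithmetic is below; every number in it is a placeholder, since hours used, compute region, and measured power draw are all unreported for this model.

```python
# All values are hypothetical placeholders, not measurements for T5v2.
power_draw_kw = 0.3        # assumed average accelerator power draw (kW)
hours_used = 1000.0        # assumed training wall-clock hours
grid_intensity = 0.43      # assumed kg CO2eq per kWh for the region

emissions_kg = power_draw_kw * hours_used * grid_intensity
print(f"Estimated emissions: {emissions_kg:.0f} kg CO2eq")
```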
|
|
|
|
|
- **Hardware Type:** Google Cloud TPU Pods
- **Hours used:** More information needed
- **Cloud Provider:** GCP
- **Compute Region:** More information needed
- **Carbon Emitted:** More information needed
|
|
|
|
|
# Citation |
|
|
|
|
|
**BibTeX:** |
|
|
|
|
|
```bibtex
@article{2024t5v2,
  author = {Lintang Sutawika and Aran Komatsuzaki and Colin Raffel},
  title  = {T5v2, an update of T5},
  year   = {2024},
  url    = {}
}
```
|
|
|
|
|
# How to Get Started with the Model |
|
|
|
|
|
Use the code below to get started with the model. |
|
|
|
|
|
<details> |
|
|
<summary> Click to expand </summary> |
|
|
|
|
|
```python
from transformers import AutoTokenizer, UMT5Model

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/t5-v2-base")
model = UMT5Model.from_pretrained("EleutherAI/t5-v2-base")

input_ids = tokenizer(
    "Studies have been shown that owning a dog is good for you", return_tensors="pt"
).input_ids  # Batch size 1
decoder_input_ids = tokenizer("Studies show that", return_tensors="pt").input_ids  # Batch size 1

# Forward pass: returns the decoder's final hidden states
outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
last_hidden_states = outputs.last_hidden_state
```
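
For text-to-text use matching the summarization and translation tags above, the checkpoint can also be loaded with the generic `transformers` generation head, as sketched below. Whether the pre-trained (not fine-tuned) model produces useful task output, and whether T5-style task prefixes apply, are assumptions here.

```python
from transformers import AutoTokenizer, UMT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/t5-v2-base")
model = UMT5ForConditionalGeneration.from_pretrained("EleutherAI/t5-v2-base")

# Illustrative input; the "summarize:" prefix is a T5 convention and may not
# apply to this checkpoint without fine-tuning.
input_ids = tokenizer(
    "summarize: Studies have been shown that owning a dog is good for you",
    return_tensors="pt",
).input_ids
outputs = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```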
|
|
|
|
|
|
|
|
</details> |