Update README.md
The model was pre-trained continuously on a single A10G GPU on an AWS instance for 133 hours, with each epoch taking 45 hours, using bf16 (bfloat16) precision.
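
As a rough illustration of that setup (not the actual training script), bf16 training with the Hugging Face `transformers` Trainer can be enabled as sketched below; the checkpoint name, batch size, epoch count, and dataset are placeholders, not the values used for this model.

```python
# Hypothetical sketch of bf16 pre-training with the Hugging Face Trainer.
# Checkpoint, batch size, and dataset are placeholders, not this model's actual values.
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")  # placeholder checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

training_args = Seq2SeqTrainingArguments(
    output_dir="pretraining-output",
    bf16=True,                      # bfloat16 precision, supported on A10G GPUs
    per_device_train_batch_size=8,  # placeholder batch size
    num_train_epochs=3,             # ~3 epochs at ~45 h each roughly matches the 133 h total
    save_strategy="epoch",
)

# trainer = Seq2SeqTrainer(
#     model=model,
#     args=training_args,
#     train_dataset=train_dataset,  # placeholder: the pre-training corpus
#     tokenizer=tokenizer,
# )
# trainer.train()
```
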
#### Possible Future Directions:
1. Use a decoder-only model for pre-training and summarization; a minimal sketch of this direction follows below.
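
The sketch below shows what a decoder-only summarization setup could look like, using a generic causal LM and a simple prompt; the checkpoint and prompt format are assumptions for illustration only, not part of this model.

```python
# Hypothetical sketch of decoder-only summarization via prompting.
# The checkpoint and prompt format are placeholders.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder decoder-only checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

article = "..."  # text to summarize
prompt = f"{article}\nTL;DR:"  # a simple summarization prompt

inputs = tokenizer(prompt, return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    max_new_tokens=60,
    pad_token_id=tokenizer.eos_token_id,
)

# Strip the prompt tokens and decode only the generated continuation.
generated = summary_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```
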
#### Authors:
<a href="https://www.linkedin.com/in/bijaya-bhatta-69536018a/">Vijaya Bhatta</a>