daniyalfarh/text-summarization-T5


Daniyal Farhangi


	---
	library_name: peft
	license: apache-2.0
	base_model: t5-small
	tags:
	- generated_from_trainer
	datasets:
	- xsum
	model-index:
	- name: text-summarization-T5
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# text-summarization-T5

	This model is a version of [t5-small](https://huggingface.co/t5-small) fine-tuned with PEFT on the XSum (`xsum`) dataset.
	It achieves the following result on the evaluation set:
	- Loss: 2.6883
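
The reported loss is easier to interpret as a perplexity. The sketch below assumes the value is the Trainer's mean per-token cross-entropy in nats, which is the usual case for T5 models but is not stated in this card. `perplexity` is a helper name introduced here for illustration.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) into perplexity."""
    return math.exp(mean_cross_entropy)


# Final evaluation loss reported in this card.
eval_loss = 2.6883
print(f"perplexity ~ {perplexity(eval_loss):.1f}")  # prints "perplexity ~ 14.7"
```

Under that assumption, the model is on average about as uncertain as a uniform choice among roughly 15 tokens at each position of the reference summary.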

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
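
As a starting point, here is a minimal sketch of loading the adapter for inference. It rests on several assumptions that this card does not confirm: that the repository id is `daniyalfarh/text-summarization-T5` (taken from the page header), that the weights are a PEFT adapter loadable on top of `t5-small`, and that inputs carry the conventional T5 `summarize: ` prefix. The helper names `build_input` and `summarize` are hypothetical.

```python
def build_input(document: str, prefix: str = "summarize: ") -> str:
    """Prepend the task prefix conventionally used with T5 checkpoints."""
    return prefix + document.strip()


def summarize(document: str, max_new_tokens: int = 60) -> str:
    """Load the base model plus the adapter and generate one summary."""
    # Imported lazily so build_input stays usable without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    base = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
    # Assumed repository id; adjust if the adapter lives elsewhere.
    model = PeftModel.from_pretrained(base, "daniyalfarh/text-summarization-T5")
    model.eval()

    inputs = tokenizer(
        build_input(document),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads both the base checkpoint and the adapter on first use. XSum targets are single-sentence summaries, so a small `max_new_tokens` is a reasonable default, though the best value for this adapter is untested here.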

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 2
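
The list above can be mapped onto `transformers.TrainingArguments` keywords roughly as follows. This is a sketch, not the author's script. It assumes a single device, so the per-device batch size equals the reported 16. The `output_dir` value and the helpers `effective_batch_size` and `make_training_arguments` are placeholders introduced here.

```python
# Values copied from the hyperparameter list; keyword names follow
# transformers.TrainingArguments. output_dir is a placeholder.
training_kwargs = {
    "output_dir": "text-summarization-T5",
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 2,
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step across all devices."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def make_training_arguments(kwargs: dict):
    """Build the arguments object; transformers is imported only when needed."""
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)


print(effective_batch_size(training_kwargs))  # 128, matching total_train_batch_size
```

The product 16 × 8 = 128 reproduces `total_train_batch_size` without any device multiplier, which is consistent with single-device training.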

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 3.8764 | 0.0627 | 100 | 3.6376 |
	| 3.6129 | 0.1255 | 200 | 3.2631 |
	| 3.3392 | 0.1882 | 300 | 3.0248 |
	| 3.207 | 0.2509 | 400 | 2.9294 |
	| 3.1548 | 0.3137 | 500 | 2.8725 |
	| 3.0969 | 0.3764 | 600 | 2.8333 |
	| 3.0718 | 0.4391 | 700 | 2.8018 |
	| 3.0476 | 0.5018 | 800 | 2.7803 |
	| 3.0431 | 0.5646 | 900 | 2.7651 |
	| 3.0216 | 0.6273 | 1000 | 2.7538 |
	| 3.0003 | 0.6900 | 1100 | 2.7440 |
	| 3.0018 | 0.7528 | 1200 | 2.7363 |
	| 2.9993 | 0.8155 | 1300 | 2.7289 |
	| 2.9833 | 0.8782 | 1400 | 2.7236 |
	| 2.9827 | 0.9410 | 1500 | 2.7181 |
	| 2.9737 | 1.0037 | 1600 | 2.7145 |
	| 2.968 | 1.0664 | 1700 | 2.7107 |
	| 2.967 | 1.1291 | 1800 | 2.7074 |
	| 2.9709 | 1.1919 | 1900 | 2.7042 |
	| 2.9593 | 1.2546 | 2000 | 2.7011 |
	| 2.9628 | 1.3173 | 2100 | 2.6987 |
	| 2.9573 | 1.3801 | 2200 | 2.6969 |
	| 2.955 | 1.4428 | 2300 | 2.6947 |
	| 2.9483 | 1.5055 | 2400 | 2.6934 |
	| 2.9546 | 1.5683 | 2500 | 2.6923 |
	| 2.9492 | 1.6310 | 2600 | 2.6910 |
	| 2.9493 | 1.6937 | 2700 | 2.6903 |
	| 2.9482 | 1.7564 | 2800 | 2.6896 |
	| 2.9524 | 1.8192 | 2900 | 2.6890 |
	| 2.9399 | 1.8819 | 3000 | 2.6886 |
	| 2.9347 | 1.9446 | 3100 | 2.6883 |
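
The Epoch and Step columns make it possible to sanity-check the run length. The sketch below estimates optimizer steps per epoch from the last logged row and multiplies by the total train batch size of 128 from the hyperparameter list. `steps_per_epoch` is a helper name introduced here. The Epoch values are rounded to four decimals, so the result is approximate.

```python
# (epoch, step) pair copied from the last row of the results table.
last_epoch, last_step = 1.9446, 3100
total_train_batch_size = 128  # from the hyperparameter list


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate optimizer steps in one epoch from a logged (epoch, step) pair."""
    return step / epoch


estimate = steps_per_epoch(last_epoch, last_step)
examples_per_epoch = estimate * total_train_batch_size

print(f"steps per epoch    ~ {estimate:.0f}")
print(f"examples per epoch ~ {examples_per_epoch:.0f}")
```

The estimate comes to roughly 1,594 steps and about 204,000 examples per epoch, which is in line with the size of the XSum training split. Two epochs would then take about 3,188 steps, so the evaluation logged at step 3100 is the last one before training ends.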


	### Framework versions

	- PEFT 0.14.0
	- Transformers 4.44.2
	- Pytorch 2.4.1+cu121
	- Datasets 3.2.0
	- Tokenizers 0.19.1