t5-small-openassistant-chat

This model is a fine-tuned version of google-t5/t5-small. The training dataset is not specified. After 40 epochs it reaches a validation loss of 2.1785 on the evaluation set.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 80
eval_batch_size: 1
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
mixed_precision_training: Native AMP

Training results

Training Loss	Epoch	Step	Validation Loss
3.3768	1.0	301	2.3842
2.6839	2.0	602	2.3277
2.6351	3.0	903	2.2995
2.6016	4.0	1204	2.2818
2.5803	5.0	1505	2.2680
2.5587	6.0	1806	2.2571
2.541	7.0	2107	2.2481
2.5323	8.0	2408	2.2409
2.5102	9.0	2709	2.2349
2.5063	10.0	3010	2.2288
2.4953	11.0	3311	2.2242
2.4926	12.0	3612	2.2192
2.4786	13.0	3913	2.2154
2.472	14.0	4214	2.2117
2.4662	15.0	4515	2.2079
2.4553	16.0	4816	2.2051
2.4472	17.0	5117	2.2020
2.4488	18.0	5418	2.2008
2.4367	19.0	5719	2.1972
2.4353	20.0	6020	2.1952
2.429	21.0	6321	2.1934
2.4247	22.0	6622	2.1912
2.4242	23.0	6923	2.1901
2.4196	24.0	7224	2.1887
2.4169	25.0	7525	2.1873
2.4122	26.0	7826	2.1862
2.4089	27.0	8127	2.1851
2.4042	28.0	8428	2.1841
2.4061	29.0	8729	2.1831
2.4007	30.0	9030	2.1823
2.397	31.0	9331	2.1814
2.3998	32.0	9632	2.1810
2.3963	33.0	9933	2.1805
2.3976	34.0	10234	2.1798
2.3919	35.0	10535	2.1794
2.3873	36.0	10836	2.1793
2.3899	37.0	11137	2.1789
2.3886	38.0	11438	2.1786
2.3906	39.0	11739	2.1786
2.393	40.0	12040	2.1785
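
The validation losses above can be easier to interpret as perplexities. The following is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for T5 fine-tuning but is not stated in this card. The `perplexity` helper and the choice of epochs are illustrative.

```python
import math

# (epoch, validation loss) pairs copied from the results table.
VAL_LOSS = {1: 2.3842, 10: 2.2288, 20: 2.1952, 30: 2.1823, 40: 2.1785}


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss)


for epoch, loss in VAL_LOSS.items():
    print(f"epoch {epoch:2d}  val loss {loss:.4f}  perplexity {perplexity(loss):.2f}")
```

Read this way, most of the improvement occurs in the first 10 to 20 epochs, and the curve is nearly flat after epoch 30.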