t5-small-openassistant-chat

This model is a fine-tuned version of google-t5/t5-small. The training dataset is not recorded in the card metadata (it is listed as "None"). It achieves the following results on the evaluation set:

  • Loss: 2.1785
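
For intuition, a cross-entropy loss can be converted to perplexity by exponentiating it. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is how the Transformers T5 implementation computes its loss. The function name `perplexity_from_loss` is illustrative and not part of this card.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


# Final evaluation loss reported in this card.
eval_loss = 2.1785
print(f"perplexity ~ {perplexity_from_loss(eval_loss):.2f}")
```

Under that assumption, the evaluation loss corresponds to a perplexity of roughly 8.8.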

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 80
  • eval_batch_size: 1
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
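
To illustrate what the linear scheduler does with these settings, here is a minimal sketch of a linear decay from the base learning rate to zero. It assumes no warmup steps (the card does not list any) and takes the total of 12040 optimizer steps from the training results table. The helper `linear_lr` is hypothetical, written to mirror the usual warmup-then-linear-decay rule rather than copied from any library.

```python
def linear_lr(step: int, base_lr: float = 5e-5, total_steps: int = 12040,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase (unused here, since warmup_steps is assumed to be 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, at each quarter, and at the end of training.
for step in (0, 3010, 6020, 9030, 12040):
    print(f"step {step:>5}: lr = {linear_lr(step):.2e}")
```

With these assumptions the learning rate starts at 5e-05, is halved by the end of epoch 20 (step 6020), and reaches zero at step 12040.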

Training results

Training Loss  Epoch  Step   Validation Loss
3.3768         1.0    301    2.3842
2.6839         2.0    602    2.3277
2.6351         3.0    903    2.2995
2.6016         4.0    1204   2.2818
2.5803         5.0    1505   2.2680
2.5587         6.0    1806   2.2571
2.541          7.0    2107   2.2481
2.5323         8.0    2408   2.2409
2.5102         9.0    2709   2.2349
2.5063         10.0   3010   2.2288
2.4953         11.0   3311   2.2242
2.4926         12.0   3612   2.2192
2.4786         13.0   3913   2.2154
2.472          14.0   4214   2.2117
2.4662         15.0   4515   2.2079
2.4553         16.0   4816   2.2051
2.4472         17.0   5117   2.2020
2.4488         18.0   5418   2.2008
2.4367         19.0   5719   2.1972
2.4353         20.0   6020   2.1952
2.429          21.0   6321   2.1934
2.4247         22.0   6622   2.1912
2.4242         23.0   6923   2.1901
2.4196         24.0   7224   2.1887
2.4169         25.0   7525   2.1873
2.4122         26.0   7826   2.1862
2.4089         27.0   8127   2.1851
2.4042         28.0   8428   2.1841
2.4061         29.0   8729   2.1831
2.4007         30.0   9030   2.1823
2.397          31.0   9331   2.1814
2.3998         32.0   9632   2.1810
2.3963         33.0   9933   2.1805
2.3976         34.0   10234  2.1798
2.3919         35.0   10535  2.1794
2.3873         36.0   10836  2.1793
2.3899         37.0   11137  2.1789
2.3886         38.0   11438  2.1786
2.3906         39.0   11739  2.1786
2.393          40.0   12040  2.1785
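
The step counts in the table also allow a rough estimate of the training set size, and the validation losses show how much of the improvement came late in training. The sketch below assumes a single device, no gradient accumulation, and a kept final partial batch; none of these is stated in the card, so treat the bounds as approximate.

```python
# Values read from the hyperparameters and the results table.
steps_per_epoch = 301
train_batch_size = 80
num_epochs = 40

# Total optimizer steps; should match the last row of the table.
total_steps = steps_per_epoch * num_epochs

# Bounds on the number of training examples per epoch (assumptions above).
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1

# Validation loss at selected epochs, copied from the table.
val_loss = {1: 2.3842, 35: 2.1794, 40: 2.1785}
total_gain = round(val_loss[1] - val_loss[40], 4)
late_gain = round(val_loss[35] - val_loss[40], 4)

print(f"total steps: {total_steps}")
print(f"training examples: between {min_examples} and {max_examples}")
print(f"validation loss drop, epochs 1-40: {total_gain}")
print(f"validation loss drop, epochs 35-40: {late_gain}")
```

Under these assumptions the training set holds roughly 24,000 examples. The validation loss falls by about 0.21 over the full run but by only about 0.001 over the last five epochs, which suggests the run had largely plateaued.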

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.6.0+cu124
  • Datasets 4.0.0
  • Tokenizers 0.21.4

Model size

  • Parameters: 60.5M
  • Tensor type: F32
  • Format: Safetensors