metadata
tags:
  - generated_from_keras_callback
model-index:
  - name: madatnlp/ke-t5-scratch
    results: []

madatnlp/ke-t5-scratch

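This model is a fine-tuned version of madatnlp/ke-t5-math-py on an unknown dataset. After the final logged epoch (epoch 88, with epochs numbered from 0) it reports the following losses; the validation loss is measured on the evaluation set: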

  • Train Loss: 1.8367
  • Validation Loss: 1.5850
  • Epoch: 88

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'Adam', 'learning_rate': 1e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
  • training_precision: float32
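To make the optimizer entry concrete, the following is a minimal pure-Python sketch of a single Adam update using the listed values. It follows the standard bias-corrected Adam rule and is not necessarily identical to the Keras implementation used for training. The names `ADAM_CONFIG` and `adam_step` are hypothetical, and `decay` and `amsgrad` are left unused because the card lists them as 0.0 and False.

```python
import math

# Values copied from the optimizer entry in this model card.
ADAM_CONFIG = {
    "learning_rate": 1e-05,
    "decay": 0.0,        # no learning-rate decay, so it is not applied below
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,    # plain Adam, so no running maximum of v is kept
}


def adam_step(param, grad, m, v, t, cfg=ADAM_CONFIG):
    """Apply one bias-corrected Adam update to a scalar parameter.

    t is the 1-based step count; m and v are the running first and
    second moment estimates from the previous step (0.0 at the start).
    """
    b1, b2 = cfg["beta_1"], cfg["beta_2"]
    m = b1 * m + (1.0 - b1) * grad
    v = b2 * v + (1.0 - b2) * grad * grad
    m_hat = m / (1.0 - b1 ** t)
    v_hat = v / (1.0 - b2 ** t)
    step = cfg["learning_rate"] * m_hat / (math.sqrt(v_hat) + cfg["epsilon"])
    return param - step, m, v


# Illustrative starting point: parameter 1.0, gradient 2.0, first step.
new_param, m1, v1 = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
first_step_size = 1.0 - new_param
print(f"first step size: {first_step_size:.3e}")
```

On the first step the bias-corrected moments cancel the gradient magnitude, so the parameter moves by roughly the learning rate (1e-05) whatever the gradient scale. This small step size is consistent with the slow, steady loss decrease reported under Training results.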

Training results

| Train Loss | Validation Loss | Epoch |
|---|---|---|
| 13.5076 | 11.8125 | 0 |
| 11.0983 | 9.4857 | 1 |
| 9.4413 | 7.9593 | 2 |
| 8.2675 | 6.9802 | 3 |
| 7.3769 | 6.1898 | 4 |
| 6.6978 | 5.6209 | 5 |
| 6.2266 | 5.1054 | 6 |
| 5.7871 | 4.9395 | 7 |
| 5.4937 | 4.6256 | 8 |
| 5.2013 | 4.4694 | 9 |
| 4.9649 | 4.1716 | 10 |
| 4.7273 | 4.0317 | 11 |
| 4.5237 | 3.7622 | 12 |
| 4.3581 | 3.4826 | 13 |
| 4.2078 | 3.4463 | 14 |
| 4.0755 | 3.2685 | 15 |
| 3.9494 | 3.1492 | 16 |
| 3.8338 | 3.1535 | 17 |
| 3.6767 | 2.8725 | 18 |
| 3.6546 | 3.1201 | 19 |
| 3.5395 | 3.0338 | 20 |
| 3.4086 | 2.9991 | 21 |
| 3.3886 | 2.8730 | 22 |
| 3.2900 | 2.8334 | 23 |
| 3.2906 | 2.6087 | 24 |
| 3.1844 | 2.6765 | 25 |
| 3.1672 | 2.6972 | 26 |
| 3.1023 | 2.5778 | 27 |
| 3.0528 | 2.5352 | 28 |
| 2.9885 | 2.5250 | 29 |
| 2.9455 | 2.6048 | 30 |
| 2.9025 | 2.3874 | 31 |
| 2.9228 | 2.4521 | 32 |
| 2.8160 | 2.2810 | 33 |
| 2.7895 | 2.3317 | 34 |
| 2.7372 | 2.3300 | 35 |
| 2.7494 | 2.3160 | 36 |
| 2.7219 | 2.3736 | 37 |
| 2.6818 | 2.3031 | 38 |
| 2.6464 | 2.2736 | 39 |
| 2.5834 | 2.2104 | 40 |
| 2.5779 | 2.0641 | 41 |
| 2.5577 | 2.0439 | 42 |
| 2.5212 | 2.0828 | 43 |
| 2.5029 | 2.1416 | 44 |
| 2.4391 | 2.0837 | 45 |
| 2.4556 | 2.0950 | 46 |
| 2.4138 | 1.8874 | 47 |
| 2.4138 | 1.9967 | 48 |
| 2.3698 | 2.0096 | 49 |
| 2.3776 | 1.9152 | 50 |
| 2.3011 | 2.0284 | 51 |
| 2.3454 | 2.0002 | 52 |
| 2.2767 | 1.9544 | 53 |
| 2.2332 | 1.8651 | 54 |
| 2.2900 | 1.9383 | 55 |
| 2.2442 | 1.8779 | 56 |
| 2.2183 | 1.8790 | 57 |
| 2.1824 | 1.7470 | 58 |
| 2.1648 | 1.7715 | 59 |
| 2.1859 | 1.8188 | 60 |
| 2.1529 | 1.7747 | 61 |
| 2.1343 | 1.8870 | 62 |
| 2.1344 | 1.8471 | 63 |
| 2.0876 | 1.8135 | 64 |
| 2.0775 | 1.7311 | 65 |
| 2.0557 | 1.8648 | 66 |
| 2.1017 | 1.6826 | 67 |
| 2.0649 | 1.7404 | 68 |
| 2.0505 | 1.6182 | 69 |
| 2.0084 | 1.6731 | 70 |
| 2.0143 | 1.6890 | 71 |
| 1.9882 | 1.6767 | 72 |
| 1.9759 | 1.5758 | 73 |
| 1.9800 | 1.7079 | 74 |
| 1.9602 | 1.6354 | 75 |
| 1.9580 | 1.6015 | 76 |
| 1.9401 | 1.5779 | 77 |
| 1.9070 | 1.5071 | 78 |
| 1.9304 | 1.5554 | 79 |
| 1.8987 | 1.5434 | 80 |
| 1.8927 | 1.6711 | 81 |
| 1.9044 | 1.5399 | 82 |
| 1.8664 | 1.5820 | 83 |
| 1.8860 | 1.5097 | 84 |
| 1.8043 | 1.5495 | 85 |
| 1.8571 | 1.5327 | 86 |
| 1.8285 | 1.5381 | 87 |
| 1.8367 | 1.5850 | 88 |
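
The validation loss at the final epoch is not the lowest value in the table. A short sketch, assuming only the last eleven rows of the table are copied in (epochs 78 to 88), shows how to locate the best epoch and its gap to the final one. The names `ROWS` and `best_by_validation` are hypothetical and used only for illustration.

```python
# Excerpt of the training results table: (train_loss, validation_loss, epoch)
# for epochs 78 to 88 only. Earlier rows are omitted for brevity.
ROWS = [
    (1.9070, 1.5071, 78),
    (1.9304, 1.5554, 79),
    (1.8987, 1.5434, 80),
    (1.8927, 1.6711, 81),
    (1.9044, 1.5399, 82),
    (1.8664, 1.5820, 83),
    (1.8860, 1.5097, 84),
    (1.8043, 1.5495, 85),
    (1.8571, 1.5327, 86),
    (1.8285, 1.5381, 87),
    (1.8367, 1.5850, 88),
]


def best_by_validation(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


best = best_by_validation(ROWS)
final = ROWS[-1]
gap = final[1] - best[1]  # how much worse the final epoch is than the best

print(f"best epoch in excerpt: {best[2]} (validation loss {best[1]:.4f})")
print(f"final epoch: {final[2]} (validation loss {final[1]:.4f}, gap {gap:.4f})")
```

Within the full table, epoch 78 (validation loss 1.5071) is also the overall minimum, so the final epoch sits about 0.078 above the best logged value. The validation loss fluctuates from epoch to epoch throughout the run, so a difference of this size may simply reflect noise rather than overfitting.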

Framework versions

  • Transformers 4.18.0
  • TensorFlow 2.8.0
  • Datasets 2.1.0
  • Tokenizers 0.12.1