lapp0 committed · Commit 37ed7e9 · verified · 1 Parent(s): 519d300

End of training

Files changed (1):
  1. README.md +8 -11
README.md CHANGED

@@ -16,13 +16,9 @@ This student model is distilled from the teacher model [gpt2](https://huggingfac
 The [Distily](https://github.com/lapp0/distily) library was used for this distillation.
 
 It achieves the following results on the evaluation set:
-- eval_enwikippl: 16455.1230
-- eval_frwikippl: 38444.9648
-- eval_zhwikippl: 56717.4922
-- eval_loss: 0.0004
-- eval_runtime: 0.0554
-- eval_samples_per_second: 18.066
-- eval_steps_per_second: 18.066
+- eval_enwikippl: 30.2266
+- eval_frwikippl: 57.3005
+- eval_zhwikippl: 18.1903
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment.
@@ -57,14 +53,15 @@ The following hyperparameters were used during training:
 - num_epochs: 1.0
 
 ### Resource Usage
-Peak GPU Memory: 1.2452 GB
+Peak GPU Memory: 1.2453 GB
 
 ### Model Results
 | epoch | step | eval_enwikippl | eval_frwikippl | eval_loss | eval_runtime | eval_samples_per_second | eval_steps_per_second | eval_zhwikippl |
 | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| 0 | 0 | 63012.375 | 58568.7617 | 0.0042 | 0.076 | 13.155 | 13.155 | 62696.3008 |
-| 0.4040 | 40 | 20128.3281 | 41006.9219 | 0.0004 | 0.0553 | 18.079 | 18.079 | 58574.4609 |
-| 0.8081 | 80 | 16455.1230 | 38444.9648 | 0.0004 | 0.0554 | 18.066 | 18.066 | 56717.4922 |
+| | teacher | 30.2266 | 57.3005 | | | | | 18.1903 |
+| 0 | 0 | 53288.7773 | 55702.1719 | 0.0041 | 0.0758 | 13.185 | 13.185 | 55025.875 |
+| 0.4040 | 40 | 20265.3535 | 39300.7383 | 0.0004 | 0.0554 | 18.059 | 18.059 | 53151.6875 |
+| 0.8081 | 80 | 17527.1328 | 38131.125 | 0.0004 | 0.0553 | 18.096 | 18.096 | 51728.4688 |
 
 ### Framework versions
 - Distily 0.1.0
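
The `eval_enwikippl` / `eval_frwikippl` / `eval_zhwikippl` columns appear, from their naming, to be perplexities on English, French, and Chinese Wikipedia evaluation slices (an assumption; the diff does not define them). Perplexity is the exponential of the mean per-token negative log-likelihood, which is how the student's ~17.5k enwikippl can be compared against the teacher's 30.2 on the same data. A minimal sketch of that conversion, not Distily's actual evaluation code:

```python
import math

def perplexity(token_nlls):
    # Perplexity = exp of the mean per-token negative log-likelihood
    # (natural log). Lower NLL -> lower perplexity -> better model.
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model that assigns every token probability 1/2 has NLL = ln(2)
# per token, and therefore perplexity 2.
print(perplexity([math.log(2)] * 4))
```

In these units a perplexity of 30 means the model is, on average, as uncertain as a uniform choice among 30 tokens, so the gap between the student's tens of thousands and the teacher's ~30 is the headline result of this training run.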