End of training
README.md
CHANGED
@@ -16,13 +16,9 @@ This student model is distilled from the teacher model [gpt2](https://huggingfac
 The [Distily](https://github.com/lapp0/distily) library was used for this distillation.

 It achieves the following results on the evaluation set:
-- eval_enwikippl:
-- eval_frwikippl:
-- eval_zhwikippl:
-- eval_loss: 0.0004
-- eval_runtime: 0.0554
-- eval_samples_per_second: 18.066
-- eval_steps_per_second: 18.066
+- eval_enwikippl: 30.2266
+- eval_frwikippl: 57.3005
+- eval_zhwikippl: 18.1903

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment.
@@ -57,14 +53,15 @@ The following hyperparameters were used during training:
 - num_epochs: 1.0

 ### Resource Usage
-Peak GPU Memory: 1.
+Peak GPU Memory: 1.2453 GB

 ### Model Results
 | epoch | step | eval_enwikippl | eval_frwikippl | eval_loss | eval_runtime | eval_samples_per_second | eval_steps_per_second | eval_zhwikippl |
 | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-|
-| 0
-| 0.
+| | teacher | 30.2266 | 57.3005 | | | | | 18.1903 |
+| 0 | 0 | 53288.7773 | 55702.1719 | 0.0041 | 0.0758 | 13.185 | 13.185 | 55025.875 |
+| 0.4040 | 40 | 20265.3535 | 39300.7383 | 0.0004 | 0.0554 | 18.059 | 18.059 | 53151.6875 |
+| 0.8081 | 80 | 17527.1328 | 38131.125 | 0.0004 | 0.0553 | 18.096 | 18.096 | 51728.4688 |

 ### Framework versions
 - Distily 0.1.0
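As a reading aid for the Model Results table in this diff, the sketch below parses the added rows and reports each student checkpoint's perplexity as a multiple of the teacher row. It is not part of the commit or of Distily: `parse_table` and `ppl_ratios` are hypothetical helper names, and the table text is simply copied from the new side of the diff.

```python
# Rows copied verbatim from the "+" side of the Model Results table.
# The teacher row has no epoch, loss, or runtime values.
TABLE = """\
| epoch | step | eval_enwikippl | eval_frwikippl | eval_loss | eval_runtime | eval_samples_per_second | eval_steps_per_second | eval_zhwikippl |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | teacher | 30.2266 | 57.3005 | | | | | 18.1903 |
| 0 | 0 | 53288.7773 | 55702.1719 | 0.0041 | 0.0758 | 13.185 | 13.185 | 55025.875 |
| 0.4040 | 40 | 20265.3535 | 39300.7383 | 0.0004 | 0.0554 | 18.059 | 18.059 | 53151.6875 |
| 0.8081 | 80 | 17527.1328 | 38131.125 | 0.0004 | 0.0553 | 18.096 | 18.096 | 51728.4688 |
"""

# Columns that hold perplexity values (names taken from the table header).
PPL_COLUMNS = ("eval_enwikippl", "eval_frwikippl", "eval_zhwikippl")


def parse_table(text):
    """Parse a pipe-delimited Markdown table into a list of dicts of strings."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header and the '---' separator row
        cells = [cell.strip() for cell in ln.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def ppl_ratios(rows):
    """Return {step: {column: student_ppl / teacher_ppl}} for student rows."""
    teacher = next(r for r in rows if r["step"] == "teacher")
    ratios = {}
    for r in rows:
        if r["step"] == "teacher":
            continue
        ratios[int(r["step"])] = {
            col: float(r[col]) / float(teacher[col]) for col in PPL_COLUMNS
        }
    return ratios


if __name__ == "__main__":
    for step, cols in sorted(ppl_ratios(parse_table(TABLE)).items()):
        summary = ", ".join(f"{name}: {value:.1f}x" for name, value in cols.items())
        print(f"step {step}: {summary}")
```

Dividing by the teacher row makes the gap explicit: at every logged step the student's perplexity is still far above the teacher's, even though `eval_loss` has already dropped to 0.0004.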