deb101 committed · verified
Commit 844befc · 1 Parent(s): b5508e7

Model save

README.md CHANGED
@@ -16,35 +16,35 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) on the None dataset.
 It achieves the following results on the evaluation set:
- - F1 Micro: 0.0374
- - F1 Macro: 0.0013
- - Precision At 5: 0.2765
- - Recall At 5: 0.1166
- - Precision At 8: 0.2353
- - Recall At 8: 0.1441
- - Precision At 15: 0.1534
- - Recall At 15: 0.1748
- - Rare F1 Micro: 0.0115
- - Rare F1 Macro: 0.0007
- - Rare Precision: 0.3415
- - Rare Recall: 0.0059
- - Rare Precision At 5: 0.2324
- - Rare Recall At 5: 0.0985
- - Rare Precision At 8: 0.2004
- - Rare Recall At 8: 0.1223
- - Rare Precision At 15: 0.1265
- - Rare Recall At 15: 0.1500
- - Not Rare F1 Micro: 0.5
- - Not Rare F1 Macro: 0.5
- - Not Rare Precision: 0.5
- - Not Rare Recall: 0.5
- - Not Rare Precision At 5: 0.0809
- - Not Rare Recall At 5: 0.4044
- - Not Rare Precision At 8: 0.0506
- - Not Rare Recall At 8: 0.4044
- - Not Rare Precision At 15: 0.0270
- - Not Rare Recall At 15: 0.4044
- - Loss: 0.1015
+ - F1 Micro: 0.0
+ - F1 Macro: 0.0
+ - Precision At 5: 0.2749
+ - Recall At 5: 0.0637
+ - Precision At 8: 0.2540
+ - Recall At 8: 0.0909
+ - Precision At 15: 0.1905
+ - Recall At 15: 0.1224
+ - Rare F1 Micro: 0.0
+ - Rare F1 Macro: 0.0
+ - Rare Precision: 0.0
+ - Rare Recall: 0.0
+ - Rare Precision At 5: 0.0037
+ - Rare Recall At 5: 0.0013
+ - Rare Precision At 8: 0.0043
+ - Rare Recall At 8: 0.0023
+ - Rare Precision At 15: 0.0049
+ - Rare Recall At 15: 0.0048
+ - Not Rare F1 Micro: 0.0
+ - Not Rare F1 Macro: 0.0
+ - Not Rare Precision: 0.0
+ - Not Rare Recall: 0.0
+ - Not Rare Precision At 5: 0.2742
+ - Not Rare Recall At 5: 0.1680
+ - Not Rare Precision At 8: 0.2540
+ - Not Rare Recall At 8: 0.2396
+ - Not Rare Precision At 15: 0.1906
+ - Not Rare Recall At 15: 0.3248
+ - Loss: 0.0209
 
 ## Model description
 
@@ -71,21 +71,15 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 32
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 50
- - num_epochs: 7
+ - lr_scheduler_warmup_steps: 500
+ - num_epochs: 1
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | F1 Micro | F1 Macro | Precision At 5 | Recall At 5 | Precision At 8 | Recall At 8 | Precision At 15 | Recall At 15 | Rare F1 Micro | Rare F1 Macro | Rare Precision | Rare Recall | Rare Precision At 5 | Rare Recall At 5 | Rare Precision At 8 | Rare Recall At 8 | Rare Precision At 15 | Rare Recall At 15 | Not Rare F1 Micro | Not Rare F1 Macro | Not Rare Precision | Not Rare Recall | Not Rare Precision At 5 | Not Rare Recall At 5 | Not Rare Precision At 8 | Not Rare Recall At 8 | Not Rare Precision At 15 | Not Rare Recall At 15 | Validation Loss |
 |:-------------:|:------:|:----:|:--------:|:--------:|:--------------:|:-----------:|:--------------:|:-----------:|:---------------:|:------------:|:-------------:|:-------------:|:--------------:|:-----------:|:-------------------:|:----------------:|:-------------------:|:----------------:|:--------------------:|:-----------------:|:-----------------:|:-----------------:|:------------------:|:---------------:|:-----------------------:|:--------------------:|:-----------------------:|:--------------------:|:------------------------:|:---------------------:|:---------------:|
- | 0.0989 | 1.0 | 18 | 0.0 | 0.0 | 0.2721 | 0.1084 | 0.1958 | 0.1246 | 0.1270 | 0.1470 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2088 | 0.0885 | 0.1507 | 0.1015 | 0.0985 | 0.1201 | 0.5956 | 0.3733 | 0.5956 | 0.5956 | 0.0809 | 0.4044 | 0.0506 | 0.4044 | 0.0270 | 0.4044 | 0.1039 |
- | 0.1013 | 2.0 | 36 | 0.0 | 0.0 | 0.2706 | 0.1117 | 0.2307 | 0.1530 | 0.1485 | 0.1725 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2279 | 0.0919 | 0.1912 | 0.1342 | 0.1270 | 0.1506 | 0.5956 | 0.3733 | 0.5956 | 0.5956 | 0.0809 | 0.4044 | 0.0506 | 0.4044 | 0.0270 | 0.4044 | 0.1054 |
- | 0.1012 | 3.0 | 54 | 0.0 | 0.0 | 0.2765 | 0.1116 | 0.2371 | 0.1512 | 0.1480 | 0.1715 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2471 | 0.1051 | 0.1930 | 0.1354 | 0.125 | 0.1493 | 0.5956 | 0.3733 | 0.5956 | 0.5956 | 0.0809 | 0.4044 | 0.0506 | 0.4044 | 0.0270 | 0.4044 | 0.1033 |
- | 0.1003 | 4.0 | 72 | 0.0 | 0.0 | 0.2765 | 0.1166 | 0.2353 | 0.1441 | 0.1495 | 0.1722 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2456 | 0.1075 | 0.1939 | 0.1366 | 0.125 | 0.1491 | 0.5956 | 0.3733 | 0.5956 | 0.5956 | 0.0809 | 0.4044 | 0.0506 | 0.4044 | 0.0270 | 0.4044 | 0.1017 |
- | 0.0872 | 5.0 | 90 | 0.0238 | 0.0010 | 0.2765 | 0.1166 | 0.2353 | 0.1441 | 0.1534 | 0.1748 | 0.0033 | 0.0005 | 0.4 | 0.0017 | 0.2456 | 0.1075 | 0.2004 | 0.1223 | 0.1265 | 0.1500 | 0.5074 | 0.4995 | 0.5074 | 0.5074 | 0.0809 | 0.4044 | 0.0506 | 0.4044 | 0.0270 | 0.4044 | 0.1042 |
- | 0.0891 | 6.0 | 108 | 0.0417 | 0.0013 | 0.2765 | 0.1166 | 0.2353 | 0.1441 | 0.1534 | 0.1748 | 0.0131 | 0.0006 | 0.3077 | 0.0067 | 0.2368 | 0.1012 | 0.2004 | 0.1223 | 0.1265 | 0.1500 | 0.5074 | 0.5060 | 0.5074 | 0.5074 | 0.0809 | 0.4044 | 0.0506 | 0.4044 | 0.0270 | 0.4044 | 0.1016 |
- | 0.0844 | 6.6197 | 119 | 0.0374 | 0.0013 | 0.2765 | 0.1166 | 0.2353 | 0.1441 | 0.1534 | 0.1748 | 0.0115 | 0.0007 | 0.3415 | 0.0059 | 0.2324 | 0.0985 | 0.2004 | 0.1223 | 0.1265 | 0.1500 | 0.5 | 0.5 | 0.5 | 0.5 | 0.0809 | 0.4044 | 0.0506 | 0.4044 | 0.0270 | 0.4044 | 0.1015 |
+ | 0.023 | 0.9981 | 262 | 0.0 | 0.0 | 0.2749 | 0.0637 | 0.2540 | 0.0909 | 0.1905 | 0.1224 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0037 | 0.0013 | 0.0043 | 0.0023 | 0.0049 | 0.0048 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2742 | 0.1680 | 0.2540 | 0.2396 | 0.1906 | 0.3248 | 0.0209 |
 
 
 ### Framework versions
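The card reports Precision/Recall At k without defining them. For a multi-label task like this one (833, now 7942, labels per config.json), these are conventionally the sample-averaged precision and recall over the k highest-scored labels. A minimal sketch of that convention, using hypothetical `y_true`/`y_score` arrays rather than the author's evaluation code:

```python
import numpy as np

def precision_recall_at_k(y_true: np.ndarray, y_score: np.ndarray, k: int):
    """Sample-averaged precision@k and recall@k for multi-label predictions.

    y_true:  (n_samples, n_labels) binary relevance matrix
    y_score: (n_samples, n_labels) predicted scores
    """
    top_k = np.argsort(-y_score, axis=1)[:, :k]        # k highest-scored label indices per sample
    hits = np.take_along_axis(y_true, top_k, axis=1)   # 1 where a top-k label is truly relevant
    precision = hits.sum(axis=1) / k
    recall = hits.sum(axis=1) / np.maximum(y_true.sum(axis=1), 1)  # avoid /0 for label-free samples
    return float(precision.mean()), float(recall.mean())

# toy check: 2 samples, 4 labels
y_true = np.array([[1, 0, 1, 0], [0, 1, 0, 0]])
y_score = np.array([[0.9, 0.2, 0.1, 0.4], [0.3, 0.8, 0.1, 0.2]])
print(precision_recall_at_k(y_true, y_score, k=2))     # (0.5, 0.75)
```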
config.json CHANGED
@@ -22,7 +22,7 @@
 "num_attention_heads": 32,
 "num_hidden_layers": 32,
 "num_key_value_heads": 8,
- "num_labels": 833,
+ "num_labels": 7942,
 "num_layers_to_unfreeze": 0,
 "pooling_output_size": 128,
 "pooling_type": "adaptive_avg_pool1d",
eval_loss_plot.png CHANGED
eval_precision_at_15_plot.png CHANGED
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:80e9a1be816fc17f7e39f7860a590d224f9b92383c0e7e451cbbad0710ac6d6b
- size 4475046623
+ oid sha256:80d1e9d271f98d5c0a9fe96f241ea75ca08e4da791ce23029772df4e845c4be8
+ size 4824468367
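The weight file is tracked with Git LFS, so this diff swaps only the pointer; the `oid sha256:` line is the hash of the actual payload. A self-contained way to verify a downloaded copy against the pointer, using only the standard library:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large checkpoints never load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# expected: 80d1e9d271f98d5c0a9fe96f241ea75ca08e4da791ce23029772df4e845c4be8
print(sha256_of("model.safetensors"))
```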
train_loss_plot.png CHANGED
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:cf48cc1852558bb553881924e6a3097c41896b995050f5a4cf1cf2a9bd2206cf
+ oid sha256:2dd9c1f6ab1ba09652359171d70300a5e24b37eaed13268307c961fe7ff4b071
 size 5496
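`training_args.bin` is a pickled `transformers.TrainingArguments`, so the warmup and epoch changes shown in the README diff can be confirmed directly from it. A hedged sketch (attribute names per the standard `TrainingArguments`; `weights_only=False` is required on PyTorch ≥ 2.6, and pickles should only be loaded from trusted sources):

```python
import torch

# training_args.bin is a pickled TrainingArguments object
args = torch.load("training_args.bin", weights_only=False)
print(args.warmup_steps)       # expected 500 after this commit (was 50)
print(args.num_train_epochs)   # expected 1 (was 7)
print(args.lr_scheduler_type)  # linear
```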