CheeseES committed (verified) · Commit e42e7af · 1 Parent(s): a4adb46

Fine-tuned openai/whisper-small on a multilingual dataset

README.md CHANGED
@@ -4,31 +4,43 @@ language:
 - ms
 - zh
 - en
-license: mit
-base_model: openai/whisper-large-v3-turbo
+license: apache-2.0
+base_model: openai/whisper-small
 tags:
+- whisper
+- multilingual
+- speech-recognition
 - generated_from_trainer
 datasets:
 - CheeseES/LLM_FINE_TUNING_1
+metrics:
+- wer
 model-index:
-- name: Whisper_Large_V3_Turbo_Tune
-  results: []
+- name: Whisper_FT_V1
+  results:
+  - task:
+      type: automatic-speech-recognition
+      name: Automatic Speech Recognition
+    dataset:
+      name: LLM Fine Tuning Dataset
+      type: CheeseES/LLM_FINE_TUNING_1
+      split: None
+      args: language
+    metrics:
+    - type: wer
+      value: 51.61904761904762
+      name: Wer
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# Whisper_Large_V3_Turbo_Tune
+# Whisper_FT_V1
 
-This model is a fine-tuned version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) on the LLM Fine Tuning Dataset dataset.
+This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the LLM Fine Tuning Dataset dataset.
 It achieves the following results on the evaluation set:
-- eval_loss: 0.0558
-- eval_wer: 48.2143
-- eval_runtime: 216.6063
-- eval_samples_per_second: 1.08
-- eval_steps_per_second: 0.139
-- epoch: 10.2564
-- step: 1200
+- Loss: 0.0892
+- Wer: 51.6190
 
 ## Model description
 
@@ -52,11 +64,35 @@ The following hyperparameters were used during training:
 - eval_batch_size: 8
 - seed: 33
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: cosine
+- lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 300
 - training_steps: 3000
 - mixed_precision_training: Native AMP
 
+### Training results
+
+| Training Loss | Epoch   | Step | Validation Loss | Wer     |
+|:-------------:|:-------:|:----:|:---------------:|:-------:|
+| 0.5149        | 0.8547  | 100  | 0.2080          | 80.4524 |
+| 0.2           | 1.7094  | 200  | 0.1883          | 82.9286 |
+| 0.1807        | 2.5641  | 300  | 0.1701          | 84.7143 |
+| 0.1561        | 3.4188  | 400  | 0.1553          | 82.6667 |
+| 0.1363        | 4.2735  | 500  | 0.1458          | 75.3571 |
+| 0.1152        | 5.1282  | 600  | 0.1367          | 71.3095 |
+| 0.0994        | 5.9829  | 700  | 0.1284          | 68.6190 |
+| 0.0865        | 6.8376  | 800  | 0.1214          | 64.5238 |
+| 0.073         | 7.6923  | 900  | 0.1136          | 69.5714 |
+| 0.0656        | 8.5470  | 1000 | 0.1091          | 66.6905 |
+| 0.0598        | 9.4017  | 1100 | 0.1049          | 69.8810 |
+| 0.0512        | 10.2564 | 1200 | 0.1025          | 65.0    |
+| 0.0481        | 11.1111 | 1300 | 0.0977          | 64.8571 |
+| 0.0429        | 11.9658 | 1400 | 0.0955          | 59.5238 |
+| 0.0385        | 12.8205 | 1500 | 0.0930          | 61.3810 |
+| 0.0338        | 13.6752 | 1600 | 0.0916          | 65.3810 |
+| 0.0334        | 14.5299 | 1700 | 0.0905          | 63.0952 |
+| 0.0298        | 15.3846 | 1800 | 0.0892          | 51.6190 |
+
+
 ### Framework versions
 
 - PEFT 0.15.2
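
For quick verification of the updated card, here is a minimal inference sketch: it loads the unchanged openai/whisper-small base and applies this commit's PEFT adapter on top. The adapter repo id below is a placeholder for wherever this commit lives, and the silent audio is a stand-in for a real clip.

```python
import numpy as np
from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Base model matches the new base_model field in the card.
base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
# Placeholder repo id -- substitute the actual adapter repository.
model = PeftModel.from_pretrained(base, "CheeseES/Whisper_FT_V1")
processor = WhisperProcessor.from_pretrained("openai/whisper-small")

audio = np.zeros(16_000, dtype=np.float32)  # stand-in: 1 s of 16 kHz audio
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")
ids = model.generate(input_features=inputs.input_features)
print(processor.batch_decode(ids, skip_special_tokens=True))
```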
adapter_config.json CHANGED
@@ -4,7 +4,7 @@
     "base_model_class": "WhisperForConditionalGeneration",
     "parent_library": "transformers.models.whisper.modeling_whisper"
   },
-  "base_model_name_or_path": "openai/whisper-large-v3-turbo",
+  "base_model_name_or_path": "openai/whisper-small",
   "bias": "none",
   "corda_config": null,
   "eva_config": null,
@@ -27,12 +27,12 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "fc1",
-    "q_proj",
     "fc2",
     "out_proj",
     "v_proj",
-    "k_proj"
+    "k_proj",
+    "fc1",
+    "q_proj"
   ],
   "task_type": null,
   "trainable_token_indices": null,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cd18fdec9de19ed00700f9f8f913bc329f56cc8f6a4eb806c88b96bef3fe9aad
-size 27916528
+oid sha256:df5f188bcadd61ac856942ff0342f2c605ff5feae5ff4d49616acb054f07d7ab
+size 13028552
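
The new pointer records a much smaller adapter (about 13.0 MB vs. 27.9 MB), consistent with whisper-small's narrower layers (hidden size 768 vs. 1280 in large-v3-turbo). A sketch for checking a downloaded file against the pointer's oid and size fields, using the values from the diff above:

```python
import hashlib
import os

def verify_lfs_object(path: str, oid: str, size: int) -> bool:
    # A Git LFS pointer stores the object's byte size and SHA-256 digest.
    if os.path.getsize(path) != size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == oid

print(verify_lfs_object(
    "adapter_model.safetensors",
    "df5f188bcadd61ac856942ff0342f2c605ff5feae5ff4d49616acb054f07d7ab",
    13028552,
))
```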
preprocessor_config.json CHANGED
@@ -2,7 +2,7 @@
   "chunk_length": 30,
   "dither": 0.0,
   "feature_extractor_type": "WhisperFeatureExtractor",
-  "feature_size": 128,
+  "feature_size": 80,
   "hop_length": 160,
   "n_fft": 400,
   "n_samples": 480000,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:43d1137e7a3b20a887997470a211a01efcce3749d35275a32598a5ab41a125d4
-size 5841
+oid sha256:0269ac6e49a474c3f8b6ccc7ef18792747d434a732d842fcb264e60e7bcf2aee
+size 8081