qaihm-bot committed on
Commit 80deda0 · verified · 1 Parent(s): d0ab6e0

See https://github.com/quic/ai-hub-models/releases/v0.32.0 for changelog.

Files changed (4)
  1. .gitattributes +1 -0
  2. DEPLOYMENT_MODEL_LICENSE.pdf +3 -0
  3. LICENSE +2 -0
  4. README.md +38 -34
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ DEPLOYMENT_MODEL_LICENSE.pdf filter=lfs diff=lfs merge=lfs -text
DEPLOYMENT_MODEL_LICENSE.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
+ size 109629
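The three added lines are a Git LFS pointer: the repository stores this small text stub, while the PDF itself lives in LFS storage. As a minimal sketch (the helper name `parse_lfs_pointer` is hypothetical and not part of this commit or of any Git LFS tooling), the pointer can be read like this:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer content copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
size 109629
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

After fetching the real file, its SHA-256 digest and byte size can be compared against `info["digest"]` and `info["size"]` to confirm the download matches the pointer.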
LICENSE ADDED
@@ -0,0 +1,2 @@
+ The license of the original trained model can be found at https://github.com/huggingface/transformers/blob/v4.42.3/LICENSE.
+ The license for the deployable model files (.tflite, .onnx, .dlc, .bin, etc.) can be found in DEPLOYMENT_MODEL_LICENSE.pdf.
README.md CHANGED
@@ -31,35 +31,39 @@ More details on model performance across various devices, can be found
  - Model checkpoint: openai/whisper-large-v3-turbo
  - Input resolution: 128x3000 (30 seconds audio)
  - Max decoded sequence length: 200 tokens
- - Number of parameters (HfWhisperEncoder): 1.13GB
- - Model size (HfWhisperEncoder): 391 MB
- - Number of parameters (HfWhisperDecoder): 480M
- - Model size (HfWhisperDecoder): 533 MB
 
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
  |---|---|---|---|---|---|---|---|---|
- | HfWhisperEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 2626.809 ms | 1 - 10 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 934.423 ms | 1 - 11 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 2626.809 ms | 1 - 10 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 1207.712 ms | 0 - 13 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 934.423 ms | 1 - 11 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 602.754 ms | 1 - 16 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 602.981 ms | 32 - 47 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 456.176 ms | 1 - 15 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 503.26 ms | 32 - 46 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 793.51 ms | 1 - 1 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 761.052 ms | 1396 - 1396 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 14.86 ms | 28 - 37 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 10.828 ms | 26 - 35 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 14.86 ms | 28 - 37 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 11.464 ms | 26 - 40 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 10.828 ms | 26 - 35 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 8.015 ms | 33 - 52 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 8.43 ms | 42 - 61 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 7.014 ms | 33 - 47 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 7.505 ms | 41 - 54 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 8.521 ms | 33 - 33 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 8.332 ms | 399 - 399 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 2627.161 ms | 1 - 11 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_CONTEXT_BINARY | 796.135 ms | 1 - 2 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 946.458 ms | 1 - 9 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 2627.161 ms | 1 - 11 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 1206.877 ms | 0 - 13 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN_CONTEXT_BINARY | 798.568 ms | 1 - 2 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 946.458 ms | 1 - 9 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_CONTEXT_BINARY | 796.747 ms | 1 - 3 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | PRECOMPILED_QNN_ONNX | 779.402 ms | 0 - 1538 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 620.618 ms | 1 - 17 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 601.823 ms | 32 - 47 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 456.277 ms | 1 - 15 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 504.899 ms | 33 - 47 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 797.712 ms | 1 - 1 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 764.858 ms | 1396 - 1396 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 14.841 ms | 22 - 31 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_CONTEXT_BINARY | 10.297 ms | 33 - 36 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 10.809 ms | 27 - 36 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 14.841 ms | 22 - 31 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 11.503 ms | 26 - 41 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN_CONTEXT_BINARY | 10.518 ms | 33 - 41 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 10.809 ms | 27 - 36 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_CONTEXT_BINARY | 10.484 ms | 30 - 32 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | PRECOMPILED_QNN_ONNX | 10.819 ms | 0 - 414 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 8.177 ms | 33 - 51 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 8.449 ms | 42 - 61 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 7.014 ms | 33 - 48 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 7.468 ms | 43 - 57 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 8.535 ms | 33 - 33 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 9.155 ms | 399 - 399 MB | NPU | Use Export Script |
 
 
 
@@ -123,19 +127,19 @@ Profiling Results
  HfWhisperEncoder
  Device : cs_8275 (ANDROID 14)
  Runtime : QNN_CONTEXT_BINARY
- Estimated inference time (ms) : 2626.8
- Estimated peak memory usage (MB): [1, 10]
- Total # Ops : 5026
- Compute Unit(s) : npu (5026 ops) gpu (0 ops) cpu (0 ops)
+ Estimated inference time (ms) : 2627.2
+ Estimated peak memory usage (MB): [1, 11]
+ Total # Ops : 5034
+ Compute Unit(s) : npu (5034 ops) gpu (0 ops) cpu (0 ops)
 
  ------------------------------------------------------------
  HfWhisperDecoder
  Device : cs_8275 (ANDROID 14)
  Runtime : QNN_CONTEXT_BINARY
- Estimated inference time (ms) : 14.9
- Estimated peak memory usage (MB): [28, 37]
- Total # Ops : 1213
- Compute Unit(s) : npu (1213 ops) gpu (0 ops) cpu (0 ops)
+ Estimated inference time (ms) : 14.8
+ Estimated peak memory usage (MB): [22, 31]
+ Total # Ops : 1222
+ Compute Unit(s) : npu (1222 ops) gpu (0 ops) cpu (0 ops)
  ```
 
 