qaihm-bot committed on
Commit af8b4f7 · verified · 1 Parent(s): a475c0f

See https://github.com/quic/ai-hub-models/releases/v0.34.0 for changelog.

README.md CHANGED
@@ -24,6 +24,7 @@ More details on model performance across various devices, can be found
24
  [here](https://aihub.qualcomm.com/models/whisper_large_v3_turbo).
25
 
26
 
 
27
  ### Model Details
28
 
29
  - **Model Type:** Model_use_case.speech_recognition
@@ -34,31 +35,31 @@ More details on model performance across various devices, can be found
34
 
35
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
36
  |---|---|---|---|---|---|---|---|---|
37
- | HfWhisperEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 2626.892 ms | 1 - 10 MB | NPU | Use Export Script |
38
- | HfWhisperEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 936.893 ms | 1 - 8 MB | NPU | Use Export Script |
39
- | HfWhisperEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 2626.892 ms | 1 - 10 MB | NPU | Use Export Script |
 
40
  | HfWhisperEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 1206.877 ms | 0 - 13 MB | NPU | Use Export Script |
41
- | HfWhisperEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 936.893 ms | 1 - 8 MB | NPU | Use Export Script |
42
- | HfWhisperEncoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_CONTEXT_BINARY | 798.032 ms | 1 - 3 MB | NPU | Use Export Script |
43
  | HfWhisperEncoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | PRECOMPILED_QNN_ONNX | 790.301 ms | 0 - 1538 MB | NPU | Use Export Script |
44
- | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 611.415 ms | 1 - 17 MB | NPU | Use Export Script |
45
  | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 615.66 ms | 33 - 49 MB | NPU | Use Export Script |
46
- | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 456.954 ms | 1 - 15 MB | NPU | Use Export Script |
47
  | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 454.477 ms | 62 - 77 MB | NPU | Use Export Script |
48
- | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 795.627 ms | 1 - 1 MB | NPU | Use Export Script |
49
  | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 765.822 ms | 1396 - 1396 MB | NPU | Use Export Script |
50
- | HfWhisperDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 15.033 ms | 24 - 32 MB | NPU | Use Export Script |
51
- | HfWhisperDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 10.786 ms | 25 - 34 MB | NPU | Use Export Script |
52
- | HfWhisperDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 15.033 ms | 24 - 32 MB | NPU | Use Export Script |
 
53
  | HfWhisperDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 11.503 ms | 26 - 41 MB | NPU | Use Export Script |
54
- | HfWhisperDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 10.786 ms | 25 - 34 MB | NPU | Use Export Script |
55
- | HfWhisperDecoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_CONTEXT_BINARY | 10.508 ms | 33 - 37 MB | NPU | Use Export Script |
56
  | HfWhisperDecoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | PRECOMPILED_QNN_ONNX | 10.559 ms | 0 - 415 MB | NPU | Use Export Script |
57
- | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 8.122 ms | 33 - 52 MB | NPU | Use Export Script |
58
  | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 8.646 ms | 44 - 63 MB | NPU | Use Export Script |
59
- | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 6.955 ms | 33 - 47 MB | NPU | Use Export Script |
60
  | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 7.464 ms | 42 - 56 MB | NPU | Use Export Script |
61
- | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 8.696 ms | 33 - 33 MB | NPU | Use Export Script |
62
  | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 8.462 ms | 399 - 399 MB | NPU | Use Export Script |
63
 
64
 
@@ -117,26 +118,7 @@ device. This script does the following:
117
  ```bash
118
  python -m qai_hub_models.models.whisper_large_v3_turbo.export
119
  ```
120
- ```
121
- Profiling Results
122
- ------------------------------------------------------------
123
- HfWhisperEncoder
124
- Device : cs_8275 (ANDROID 14)
125
- Runtime : QNN_CONTEXT_BINARY
126
- Estimated inference time (ms) : 2626.9
127
- Estimated peak memory usage (MB): [1, 10]
128
- Total # Ops : 5034
129
- Compute Unit(s) : npu (5034 ops) gpu (0 ops) cpu (0 ops)
130
-
131
- ------------------------------------------------------------
132
- HfWhisperDecoder
133
- Device : cs_8275 (ANDROID 14)
134
- Runtime : QNN_CONTEXT_BINARY
135
- Estimated inference time (ms) : 15.0
136
- Estimated peak memory usage (MB): [24, 32]
137
- Total # Ops : 1222
138
- Compute Unit(s) : npu (1222 ops) gpu (0 ops) cpu (0 ops)
139
- ```
140
 
141
 
142
  ## How does this work?
 
24
  [here](https://aihub.qualcomm.com/models/whisper_large_v3_turbo).
25
 
26
 
27
+
28
  ### Model Details
29
 
30
  - **Model Type:** Model_use_case.speech_recognition
 
35
 
36
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
37
  |---|---|---|---|---|---|---|---|---|
38
+ | HfWhisperEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 2632.222 ms | 1 - 12 MB | NPU | Use Export Script |
39
+ | HfWhisperEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 933.37 ms | 1 - 9 MB | NPU | Use Export Script |
40
+ | HfWhisperEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 2632.222 ms | 1 - 12 MB | NPU | Use Export Script |
41
+ | HfWhisperEncoder | float | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN_CONTEXT_BINARY | 796.32 ms | 1 - 3 MB | NPU | Use Export Script |
42
  | HfWhisperEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 1206.877 ms | 0 - 13 MB | NPU | Use Export Script |
43
+ | HfWhisperEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 933.37 ms | 1 - 9 MB | NPU | Use Export Script |
 
44
  | HfWhisperEncoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | PRECOMPILED_QNN_ONNX | 790.301 ms | 0 - 1538 MB | NPU | Use Export Script |
45
+ | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 615.261 ms | 2 - 17 MB | NPU | Use Export Script |
46
  | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 615.66 ms | 33 - 49 MB | NPU | Use Export Script |
47
+ | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 512.087 ms | 1 - 15 MB | NPU | Use Export Script |
48
  | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 454.477 ms | 62 - 77 MB | NPU | Use Export Script |
49
+ | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 797.024 ms | 1 - 1 MB | NPU | Use Export Script |
50
  | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 765.822 ms | 1396 - 1396 MB | NPU | Use Export Script |
51
+ | HfWhisperDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 14.646 ms | 18 - 28 MB | NPU | Use Export Script |
52
+ | HfWhisperDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 10.999 ms | 32 - 42 MB | NPU | Use Export Script |
53
+ | HfWhisperDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 14.646 ms | 18 - 28 MB | NPU | Use Export Script |
54
+ | HfWhisperDecoder | float | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN_CONTEXT_BINARY | 10.68 ms | 33 - 36 MB | NPU | Use Export Script |
55
  | HfWhisperDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 11.503 ms | 26 - 41 MB | NPU | Use Export Script |
56
+ | HfWhisperDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 10.999 ms | 32 - 42 MB | NPU | Use Export Script |
 
57
  | HfWhisperDecoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | PRECOMPILED_QNN_ONNX | 10.559 ms | 0 - 415 MB | NPU | Use Export Script |
58
+ | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 8.142 ms | 33 - 52 MB | NPU | Use Export Script |
59
  | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 8.646 ms | 44 - 63 MB | NPU | Use Export Script |
60
+ | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 6.991 ms | 33 - 47 MB | NPU | Use Export Script |
61
  | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 7.464 ms | 42 - 56 MB | NPU | Use Export Script |
62
+ | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 8.741 ms | 33 - 33 MB | NPU | Use Export Script |
63
  | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 8.462 ms | 399 - 399 MB | NPU | Use Export Script |
64
 
65
 
 
118
  ```bash
119
  python -m qai_hub_models.models.whisper_large_v3_turbo.export
120
  ```
121
+
122
 
123
 
124
  ## How does this work?
precompiled/qualcomm-qcs8275-proxy/Whisper-Large-V3-Turbo_HfWhisperEncoder.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:fdf8a5ade13cadc9d4ab2aa9f53e37d1ff5d7707d5fb16bdf437c6666a51c131
3
  size 1460963448
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3e0fabf8deaab7a92f89778dde08ffd7ec3841efbc40117e2d79a133d6080f43
3
  size 1460963448
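The `.bin` entries in this commit are Git LFS pointer files, so each of these diffs shows only the pointer text: the `oid` changes while the `size` stays the same. As a minimal sketch of how such a pair could be compared, the snippet below parses the two pointers from the hunk above. The helper name `parse_lfs_pointer` is hypothetical and is not part of Git LFS or `qai_hub_models`.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the "key value" lines of a Git LFS pointer file (illustrative helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields.get("oid", "").partition(":")
    return {
        "version": fields.get("version"),
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# Old and new pointer contents copied from the QCS8275 encoder diff above.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:fdf8a5ade13cadc9d4ab2aa9f53e37d1ff5d7707d5fb16bdf437c6666a51c131
size 1460963448
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3e0fabf8deaab7a92f89778dde08ffd7ec3841efbc40117e2d79a133d6080f43
size 1460963448
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
content_changed = old["oid"] != new["oid"]
size_unchanged = old["size"] == new["size"]
print(f"content changed: {content_changed}, size unchanged: {size_unchanged}")
```

A changed digest with an identical byte size suggests the binary was rebuilt rather than restructured, though the pointer alone cannot say what differs inside the file.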
precompiled/qualcomm-qcs8275-proxy/sdk_versions.yml ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
precompiled/qualcomm-qcs9075-proxy/Whisper-Large-V3-Turbo_HfWhisperEncoder.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b74d2b3957b08f729b7d874184983c07f7400a6ccbd284eba782a4d8d771721e
3
  size 1460787256
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c10c39c0def44bebacd121e5a82e9cfd0ce7469b997c501525521fe094675d21
3
  size 1460787256
precompiled/qualcomm-qcs9075-proxy/sdk_versions.yml ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
precompiled/qualcomm-sa7255p/Whisper-Large-V3-Turbo_HfWhisperEncoder.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:fdf8a5ade13cadc9d4ab2aa9f53e37d1ff5d7707d5fb16bdf437c6666a51c131
3
  size 1460963448
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3e0fabf8deaab7a92f89778dde08ffd7ec3841efbc40117e2d79a133d6080f43
3
  size 1460963448
precompiled/qualcomm-sa7255p/sdk_versions.yml ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
precompiled/qualcomm-sa8255p-proxy/Whisper-Large-V3-Turbo_HfWhisperDecoder.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:db6e94f11970bad25616ff0d941b0f99db6b96fbe7d32b4823da53a6ce2a496d
3
+ size 452212632
precompiled/{qualcomm-snapdragon-8gen2 → qualcomm-sa8255p-proxy}/Whisper-Large-V3-Turbo_HfWhisperEncoder.bin RENAMED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3158cb3a7edb47d64a407126d270cec85da7b8597e416b996587b2794de0da5b
3
  size 1460951088
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4d46c88257a1d67e9bd59daeb8b0539e260f6ff4c5aeff5a8de6a674d1bc4bac
3
  size 1460951088
precompiled/qualcomm-sa8255p-proxy/sdk_versions.yml ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
precompiled/qualcomm-sa8295p/sdk_versions.yml ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
precompiled/qualcomm-sa8775p/Whisper-Large-V3-Turbo_HfWhisperEncoder.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b74d2b3957b08f729b7d874184983c07f7400a6ccbd284eba782a4d8d771721e
3
  size 1460787256
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c10c39c0def44bebacd121e5a82e9cfd0ce7469b997c501525521fe094675d21
3
  size 1460787256
precompiled/qualcomm-sa8775p/sdk_versions.yml ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
precompiled/qualcomm-snapdragon-8-elite/Whisper-Large-V3-Turbo_HfWhisperEncoder.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:6f501f5315a02c81bfd73d795f3cb16b0f5cd87168d72de159ccf977bee083d3
3
  size 1452460080
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ecab51184af0c6485a48daf8aa82c52b8f4009214c1a4184e70641c645a94d1f
3
  size 1452460080
precompiled/qualcomm-snapdragon-8-elite/sdk_versions.yml ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
4
+ precompiled_qnn_onnx:
5
+ qairt: 2.33.2.250410134701_117956
precompiled/qualcomm-snapdragon-8gen2/sdk_versions.yml ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ sdk_versions:
2
+ precompiled_qnn_onnx:
3
+ qairt: 2.33.2.250410134701_117956
4
+ qnn_context_binary:
5
+ qairt: 2.34.2.250528164111_119506
precompiled/qualcomm-snapdragon-8gen3/Whisper-Large-V3-Turbo_HfWhisperEncoder.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:32f7866dda76d2ee73096122aba9557c3cccbe3fac994040bee2258a022cebaa
3
  size 1461004336
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c6bfa5ffd917152b9e74627eaa8626c87b72c6d0e36077afa12a8fc7eab10bba
3
  size 1461004336
precompiled/qualcomm-snapdragon-8gen3/sdk_versions.yml ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
4
+ precompiled_qnn_onnx:
5
+ qairt: 2.33.2.250410134701_117956
precompiled/qualcomm-snapdragon-x-elite/Whisper-Large-V3-Turbo_HfWhisperEncoder.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1a8215ebf95fa5c73347e567489aa597d64bf4bf3e2e2ed12ecd5b1f62e70e9b
3
  size 1460996144
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1a9f2e4719b1c9443d524c7d61473b754788cf2c3c7778cd522f189528a96c03
3
  size 1460996144
precompiled/qualcomm-snapdragon-x-elite/sdk_versions.yml ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ sdk_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.34.2.250528164111_119506
4
+ precompiled_qnn_onnx:
5
+ qairt: 2.33.2.250410134701_117956
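The added `sdk_versions.yml` files record which QAIRT build produced each runtime's assets. The sketch below reads that mapping with a small hand-written parser so that it needs no third-party package; in practice a YAML library would be the usual choice. The two-space indentation in `SAMPLE` is an assumption, since the rendered diff does not preserve leading whitespace, and `parse_sdk_versions` is a hypothetical helper name.

```python
def parse_sdk_versions(text: str) -> dict:
    """Parse the nested mapping runtime -> tool -> version (illustrative only)."""
    runtimes = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        key, _, value = raw.strip().partition(":")
        value = value.strip()
        if indent == 0:
            continue  # top-level "sdk_versions:" key, nothing to store
        if not value:
            # A key with no value opens a new runtime section.
            current = runtimes.setdefault(key, {})
        else:
            # A key with a value belongs to the most recent runtime section.
            current[key] = value
    return runtimes


# Contents of the Snapdragon X Elite file above, with assumed indentation.
SAMPLE = """sdk_versions:
  qnn_context_binary:
    qairt: 2.34.2.250528164111_119506
  precompiled_qnn_onnx:
    qairt: 2.33.2.250410134701_117956
"""

versions = parse_sdk_versions(SAMPLE)
for runtime, tools in versions.items():
    print(runtime, tools["qairt"])
```

Read this way, the files show that the `QNN_CONTEXT_BINARY` assets in this commit were built with a newer QAIRT release (2.34.2) than the `PRECOMPILED_QNN_ONNX` assets (2.33.2).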