qaihm-bot committed on
Commit d0ab6e0 · verified · 1 Parent(s): ffb4f3c

See https://github.com/quic/ai-hub-models/releases/v0.31.0 for changelog.

Files changed (1)
  1. README.md +28 -34
README.md CHANGED
@@ -38,34 +38,28 @@ More details on model performance across various devices, can be found
 
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
  |---|---|---|---|---|---|---|---|---|
- | HfWhisperEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 15161.314 ms | 0 - 8 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 797.627 ms | 1 - 4 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 938.632 ms | 1 - 8 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN | 15161.314 ms | 0 - 8 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN | 1205.304 ms | 1 - 10 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 794.787 ms | 1 - 3 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN | 938.632 ms | 1 - 8 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 803.939 ms | 1 - 3 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 599.929 ms | 1 - 17 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 621.736 ms | 32 - 47 MB | NPU | [Whisper-Large-V3-Turbo.onnx](https://huggingface.co/qualcomm/Whisper-Large-V3-Turbo/blob/main/Whisper-Large-V3-Turbo.onnx) |
- | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 458.188 ms | 1 - 15 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 500.984 ms | 34 - 49 MB | NPU | [Whisper-Large-V3-Turbo.onnx](https://huggingface.co/qualcomm/Whisper-Large-V3-Turbo/blob/main/Whisper-Large-V3-Turbo.onnx) |
- | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 796.706 ms | 1 - 1 MB | NPU | Use Export Script |
- | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 763.115 ms | 1396 - 1396 MB | NPU | [Whisper-Large-V3-Turbo.onnx](https://huggingface.co/qualcomm/Whisper-Large-V3-Turbo/blob/main/Whisper-Large-V3-Turbo.onnx) |
- | HfWhisperDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 72.072 ms | 30 - 38 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 10.134 ms | 33 - 36 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 10.932 ms | 21 - 30 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN | 72.072 ms | 30 - 38 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN | 11.327 ms | 26 - 40 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 10.597 ms | 33 - 35 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN | 10.932 ms | 21 - 30 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 10.426 ms | 31 - 33 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 8.147 ms | 33 - 52 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 8.362 ms | 17 - 36 MB | NPU | [Whisper-Large-V3-Turbo.onnx](https://huggingface.co/qualcomm/Whisper-Large-V3-Turbo/blob/main/Whisper-Large-V3-Turbo.onnx) |
- | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 6.987 ms | 33 - 48 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 7.53 ms | 43 - 56 MB | NPU | [Whisper-Large-V3-Turbo.onnx](https://huggingface.co/qualcomm/Whisper-Large-V3-Turbo/blob/main/Whisper-Large-V3-Turbo.onnx) |
- | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 8.565 ms | 33 - 33 MB | NPU | Use Export Script |
- | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 8.356 ms | 400 - 400 MB | NPU | [Whisper-Large-V3-Turbo.onnx](https://huggingface.co/qualcomm/Whisper-Large-V3-Turbo/blob/main/Whisper-Large-V3-Turbo.onnx) |
 
 
 
@@ -128,18 +122,18 @@ Profiling Results
  ------------------------------------------------------------
  HfWhisperEncoder
  Device : cs_8275 (ANDROID 14)
- Runtime : QNN
- Estimated inference time (ms) : 15161.3
- Estimated peak memory usage (MB): [0, 8]
  Total # Ops : 5026
  Compute Unit(s) : npu (5026 ops) gpu (0 ops) cpu (0 ops)
 
  ------------------------------------------------------------
  HfWhisperDecoder
  Device : cs_8275 (ANDROID 14)
- Runtime : QNN
- Estimated inference time (ms) : 72.1
- Estimated peak memory usage (MB): [30, 38]
  Total # Ops : 1213
  Compute Unit(s) : npu (1213 ops) gpu (0 ops) cpu (0 ops)
  ```
 
 
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
  |---|---|---|---|---|---|---|---|---|
+ | HfWhisperEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 2626.809 ms | 1 - 10 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 934.423 ms | 1 - 11 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 2626.809 ms | 1 - 10 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 1207.712 ms | 0 - 13 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 934.423 ms | 1 - 11 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 602.754 ms | 1 - 16 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 602.981 ms | 32 - 47 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 456.176 ms | 1 - 15 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 503.26 ms | 32 - 46 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 793.51 ms | 1 - 1 MB | NPU | Use Export Script |
+ | HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 761.052 ms | 1396 - 1396 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 14.86 ms | 28 - 37 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 10.828 ms | 26 - 35 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 14.86 ms | 28 - 37 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 11.464 ms | 26 - 40 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 10.828 ms | 26 - 35 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 8.015 ms | 33 - 52 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 8.43 ms | 42 - 61 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 7.014 ms | 33 - 47 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 7.505 ms | 41 - 54 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 8.521 ms | 33 - 33 MB | NPU | Use Export Script |
+ | HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 8.332 ms | 399 - 399 MB | NPU | Use Export Script |
 
 
 
 
  ------------------------------------------------------------
  HfWhisperEncoder
  Device : cs_8275 (ANDROID 14)
+ Runtime : QNN_CONTEXT_BINARY
+ Estimated inference time (ms) : 2626.8
+ Estimated peak memory usage (MB): [1, 10]
  Total # Ops : 5026
  Compute Unit(s) : npu (5026 ops) gpu (0 ops) cpu (0 ops)
 
  ------------------------------------------------------------
  HfWhisperDecoder
  Device : cs_8275 (ANDROID 14)
+ Runtime : QNN_CONTEXT_BINARY
+ Estimated inference time (ms) : 14.9
+ Estimated peak memory usage (MB): [28, 37]
  Total # Ops : 1213
  Compute Unit(s) : npu (1213 ops) gpu (0 ops) cpu (0 ops)
  ```
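
Reviewer note: the largest change in this diff is on the QCS8275 (Proxy) rows. The sketch below shows one way to quantify it, using inference times copied from the removed and added table rows. It assumes the old `QNN` rows and the new `QNN_CONTEXT_BINARY` rows describe the same deployment path under a renamed label, which the diff itself does not confirm. The names `TIMES_MS`, `speedup`, and `summarize` are hypothetical helpers, not part of any Qualcomm tooling.

```python
# Inference times in ms, copied from the diff: (removed QNN row, added QNN_CONTEXT_BINARY row).
# Assumption: the two runtime labels refer to comparable measurements.
TIMES_MS = {
    ("HfWhisperEncoder", "QCS8275 (Proxy)"): (15161.314, 2626.809),
    ("HfWhisperDecoder", "QCS8275 (Proxy)"): (72.072, 14.86),
    ("HfWhisperEncoder", "Samsung Galaxy S24"): (599.929, 602.754),
    ("HfWhisperDecoder", "Samsung Galaxy S24"): (8.147, 8.015),
}


def speedup(old_ms: float, new_ms: float) -> float:
    """Return old/new; a value above 1.0 means the new number is faster."""
    if new_ms <= 0:
        raise ValueError("new_ms must be positive")
    return old_ms / new_ms


def summarize(times: dict) -> dict:
    """Map each (model, device) key to its old/new ratio."""
    return {key: speedup(old, new) for key, (old, new) in times.items()}


if __name__ == "__main__":
    for (model, device), ratio in summarize(TIMES_MS).items():
        print(f"{model} on {device}: {ratio:.2f}x")
```

Under that assumption, the QCS8275 encoder and decoder times drop by factors of roughly 5.8 and 4.9, while the Samsung Galaxy S24 rows are essentially unchanged.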