v0.32.0

See https://github.com/quic/ai-hub-models/releases/v0.32.0 for changelog.

Files changed (4)

.gitattributes +1 -0
DEPLOYMENT_MODEL_LICENSE.pdf +3 -0
LICENSE +2 -0
README.md +38 -34

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+DEPLOYMENT_MODEL_LICENSE.pdf filter=lfs diff=lfs merge=lfs -text

DEPLOYMENT_MODEL_LICENSE.pdf ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
+size 109629
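
The three added lines above are a Git LFS pointer rather than the PDF itself: the spec version, the SHA-256 object ID, and the size in bytes. As a minimal sketch (the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, not part of this repository or of Git LFS), a downloaded file could be checked against such a pointer like this:

```python
import hashlib

# Pointer text as committed in this change (diff "+" markers removed).
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
size 109629
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of an LFS pointer into a dict entry."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest the pointer records."""
    algo, _, digest = pointer["oid"].partition(":")
    return (
        algo == "sha256"
        and len(data) == int(pointer["size"])
        and hashlib.sha256(data).hexdigest() == digest
    )


pointer = parse_lfs_pointer(POINTER)
print(pointer["size"])
```

To check a real download, read the file in binary mode and pass its bytes to `matches_pointer` together with the parsed pointer.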

LICENSE ADDED

@@ -0,0 +1,2 @@
+The license of the original trained model can be found at https://github.com/huggingface/transformers/blob/v4.42.3/LICENSE.
+The license for the deployable model files (.tflite, .onnx, .dlc, .bin, etc.) can be found in DEPLOYMENT_MODEL_LICENSE.pdf.

README.md CHANGED

@@ -31,35 +31,39 @@ More details on model performance across various devices, can be found
   - Model checkpoint: openai/whisper-large-v3-turbo
   - Input resolution: 128x3000 (30 seconds audio)
   - Max decoded sequence length: 200 tokens
-  - Number of parameters (HfWhisperEncoder): 1.13GB
-  - Model size (HfWhisperEncoder): 391 MB
-  - Number of parameters (HfWhisperDecoder): 480M
-  - Model size (HfWhisperDecoder): 533 MB
 | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
 |---|---|---|---|---|---|---|---|---|
-| HfWhisperEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 2626.809 ms | 1 - 10 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 934.423 ms | 1 - 11 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 2626.809 ms | 1 - 10 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 1207.712 ms | 0 - 13 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 934.423 ms | 1 - 11 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 602.754 ms | 1 - 16 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 602.981 ms | 32 - 47 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 456.176 ms | 1 - 15 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 503.26 ms | 32 - 46 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 793.51 ms | 1 - 1 MB | NPU | Use Export Script |
-| HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 761.052 ms | 1396 - 1396 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 14.86 ms | 28 - 37 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 10.828 ms | 26 - 35 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 14.86 ms | 28 - 37 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 11.464 ms | 26 - 40 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 10.828 ms | 26 - 35 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 8.015 ms | 33 - 52 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 8.43 ms | 42 - 61 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 7.014 ms | 33 - 47 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 7.505 ms | 41 - 54 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 8.521 ms | 33 - 33 MB | NPU | Use Export Script |
-| HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 8.332 ms | 399 - 399 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 2627.161 ms | 1 - 11 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_CONTEXT_BINARY | 796.135 ms | 1 - 2 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 946.458 ms | 1 - 9 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 2627.161 ms | 1 - 11 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 1206.877 ms | 0 - 13 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN_CONTEXT_BINARY | 798.568 ms | 1 - 2 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 946.458 ms | 1 - 9 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_CONTEXT_BINARY | 796.747 ms | 1 - 3 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | PRECOMPILED_QNN_ONNX | 779.402 ms | 0 - 1538 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 620.618 ms | 1 - 17 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 601.823 ms | 32 - 47 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 456.277 ms | 1 - 15 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 504.899 ms | 33 - 47 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 797.712 ms | 1 - 1 MB | NPU | Use Export Script |
+| HfWhisperEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 764.858 ms | 1396 - 1396 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 14.841 ms | 22 - 31 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_CONTEXT_BINARY | 10.297 ms | 33 - 36 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 10.809 ms | 27 - 36 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 14.841 ms | 22 - 31 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 11.503 ms | 26 - 41 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN_CONTEXT_BINARY | 10.518 ms | 33 - 41 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 10.809 ms | 27 - 36 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_CONTEXT_BINARY | 10.484 ms | 30 - 32 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | PRECOMPILED_QNN_ONNX | 10.819 ms | 0 - 414 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 8.177 ms | 33 - 51 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 8.449 ms | 42 - 61 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_CONTEXT_BINARY | 7.014 ms | 33 - 48 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | PRECOMPILED_QNN_ONNX | 7.468 ms | 43 - 57 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 8.535 ms | 33 - 33 MB | NPU | Use Export Script |
+| HfWhisperDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 9.155 ms | 399 - 399 MB | NPU | Use Export Script |
@@ -123,19 +127,19 @@ Profiling Results
 HfWhisperEncoder
 Device                          : cs_8275 (ANDROID 14)
 Runtime                         : QNN_CONTEXT_BINARY
-Estimated inference time (ms)   : 2626.8
-Estimated peak memory usage (MB): [1, 10]
-Total # Ops                     : 5026
-Compute Unit(s)                 : npu (5026 ops) gpu (0 ops) cpu (0 ops)
+Estimated inference time (ms)   : 2627.2
+Estimated peak memory usage (MB): [1, 11]
+Total # Ops                     : 5034
+Compute Unit(s)                 : npu (5034 ops) gpu (0 ops) cpu (0 ops)
 ------------------------------------------------------------
 HfWhisperDecoder
 Device                          : cs_8275 (ANDROID 14)
 Runtime                         : QNN_CONTEXT_BINARY
-Estimated inference time (ms)   : 14.9
-Estimated peak memory usage (MB): [28, 37]
-Total # Ops                     : 1213
-Compute Unit(s)                 : npu (1213 ops) gpu (0 ops) cpu (0 ops)
+Estimated inference time (ms)   : 14.8
+Estimated peak memory usage (MB): [22, 31]
+Total # Ops                     : 1222
+Compute Unit(s)                 : npu (1222 ops) gpu (0 ops) cpu (0 ops)
 ```