v0.31.0
See https://github.com/quic/ai-hub-models/releases/v0.31.0 for changelog.
README.md
CHANGED
@@ -38,18 +38,18 @@ More details on model performance across various devices, can be found
 
 | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
 |---|---|---|---|---|---|---|---|---|
-| TextEncoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile |
-| TextEncoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile |
-| TextEncoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) |
-| UNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile |
-| UNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile |
-| UNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) |
-| VAEDecoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile |
-| VAEDecoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile |
-| VAEDecoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) |
-| ControlNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile |
-| ControlNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile |
-| ControlNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) |
@@ -112,7 +112,7 @@ Profiling Results
 ------------------------------------------------------------
 TextEncoder_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
-Runtime :
 Estimated inference time (ms) : 10.9
 Estimated peak memory usage (MB): [0, 3]
 Total # Ops : 569
@@ -121,7 +121,7 @@ Compute Unit(s) : npu (569 ops) gpu (0 ops) cpu (0 ops)
 ------------------------------------------------------------
 UNet_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
-Runtime :
 Estimated inference time (ms) : 258.2
 Estimated peak memory usage (MB): [13, 15]
 Total # Ops : 5433
@@ -130,7 +130,7 @@ Compute Unit(s) : npu (5433 ops) gpu (0 ops) cpu (0 ops)
 ------------------------------------------------------------
 VAEDecoder_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
-Runtime :
 Estimated inference time (ms) : 397.6
 Estimated peak memory usage (MB): [0, 2]
 Total # Ops : 408
@@ -139,7 +139,7 @@ Compute Unit(s) : npu (408 ops) gpu (0 ops) cpu (0 ops)
 ------------------------------------------------------------
 ControlNet_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
-Runtime :
 Estimated inference time (ms) : 104.7
 Estimated peak memory usage (MB): [2, 9]
 Total # Ops : 2405
 
 | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
 |---|---|---|---|---|---|---|---|---|
+| TextEncoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 10.874 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| TextEncoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 7.918 ms | 0 - 18 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| TextEncoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 10.875 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| UNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 258.151 ms | 13 - 15 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| UNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 197.629 ms | 13 - 31 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| UNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 256.936 ms | 13 - 16 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| VAEDecoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 397.625 ms | 0 - 2 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| VAEDecoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 300.627 ms | 0 - 21 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| VAEDecoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 395.006 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| ControlNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 104.668 ms | 2 - 9 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| ControlNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 77.289 ms | 2 - 23 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| ControlNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 103.817 ms | 2 - 5 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
 ------------------------------------------------------------
 TextEncoder_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
+Runtime : QNN_DLC
 Estimated inference time (ms) : 10.9
 Estimated peak memory usage (MB): [0, 3]
 Total # Ops : 569
 ------------------------------------------------------------
 UNet_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
+Runtime : QNN_DLC
 Estimated inference time (ms) : 258.2
 Estimated peak memory usage (MB): [13, 15]
 Total # Ops : 5433
 ------------------------------------------------------------
 VAEDecoder_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
+Runtime : QNN_DLC
 Estimated inference time (ms) : 397.6
 Estimated peak memory usage (MB): [0, 2]
 Total # Ops : 408
 ------------------------------------------------------------
 ControlNet_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
+Runtime : QNN_DLC
 Estimated inference time (ms) : 104.7
 Estimated peak memory usage (MB): [2, 9]
 Total # Ops : 2405