Add/update the quantized ONNX model files and README.md for Transformers.js v3
Browse files## Applied Quantizations
### ✅ Based on `decoder_model_merged.onnx` *with* slimming
↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
onnx/decoder_model_merged_bnb4.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:ed79868842e62023ea4b9c76dd9ef82df0aa2b46e28cb8a57f2fcbde445ffae7
|
3 |
+
size 89735909
|
onnx/decoder_model_merged_fp16.onnx
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:e91c9d4526b592ee3d47927d4228b938ac5539b6b98864b93960ec580aa2832e
|
3 |
+
size 116622030
|
onnx/decoder_model_merged_int8.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:e8bdb22d3fa93ebb9e53710613a6e35ebbb148b974e859f3c888345ee81afb74
|
3 |
+
size 58766935
|
onnx/decoder_model_merged_q4.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a321ac2f6a5484ef394d748490f97a1b357f4fcc8408ab9c0695a63b746b289a
|
3 |
+
size 92335654
|
onnx/decoder_model_merged_q4f16.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:83cc586d2ceeb4d4ad61931b56c0f73b45c336182f0b32f6bedbf0b595b8223e
|
3 |
+
size 56823126
|
onnx/decoder_model_merged_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:c5ce214e9130587f6b49bf97ba5216577bcbd8eff4c7e115b835182318d7dde8
|
3 |
+
size 58766987
|