Add/update the quantized ONNX model files and README.md for Transformers.js v3
Browse files## Applied Quantizations
### ✅ Based on `decoder_model_merged.onnx` *with* slimming
↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
onnx/decoder_model_merged_bnb4.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4e9e6e4affb9311fcd06025f9928e8a74f72d878ec1466b3837f68c6173089c6
|
3 |
+
size 39160702
|
onnx/decoder_model_merged_fp16.onnx
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:6366361cf39da790286a7fe96ab2af7bcbe9b6dc39aaa6b53c9f8e90e5f13fa0
|
3 |
+
size 119485418
|
onnx/decoder_model_merged_int8.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:6e21871668eb31312532444f3f07a950ddb47c53b0b480d1614a36cdd93c450a
|
3 |
+
size 64675320
|
onnx/decoder_model_merged_q4.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:b04e270a668741cbdd5cd58eb6710945b8450bce30802b5e23031cb13c0001c8
|
3 |
+
size 42788125
|
onnx/decoder_model_merged_q4f16.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:2a8418e21c7d74348c1eed9bbc744fe14937238ccf33eac5dfaeae9d5fe9090e
|
3 |
+
size 36111051
|
onnx/decoder_model_merged_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:7038765cd1f9b29c8ac8fa0306b27f3f604e784bbf8c91efec10aa7c47388234
|
3 |
+
size 64675353
|