whitphx HF Staff committed on
Commit 9f592da · verified · 1 Parent(s): 1a537cb

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ✅ Based on `decoder_model_merged.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
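
As a rough illustration of the naming convention in the list above, the sketch below builds each quantized file name from its dtype label and compares file sizes against the fp16 variant. The byte counts are copied from the `size` fields of the LFS pointers in this commit; the helper names `quantized_filename` and `size_report` are hypothetical and not part of Transformers.js or any other library.

```python
# Sizes in bytes, copied from the `size` field of each LFS pointer in this commit.
SIZES = {
    "fp16": 325_925_282,
    "int8": 163_663_156,
    "uint8": 163_663_223,
    "q4": 185_706_225,
    "q4f16": 127_699_375,
    "bnb4": 177_088_140,
}


def quantized_filename(dtype: str, base: str = "decoder_model_merged") -> str:
    """Build the repository path for a quantized variant (hypothetical helper)."""
    if dtype not in SIZES:
        raise ValueError(f"unknown dtype: {dtype}")
    return f"onnx/{base}_{dtype}.onnx"


def size_report() -> list[tuple[str, float, float]]:
    """Return (path, size in MB, size relative to fp16), smallest file first."""
    fp16 = SIZES["fp16"]
    rows = [(quantized_filename(d), n / 1e6, n / fp16) for d, n in SIZES.items()]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for path, megabytes, ratio in size_report():
        print(f"{path:45s} {megabytes:8.1f} MB  {ratio:5.2f}x fp16")
```

File size is only a proxy for download cost; it says nothing about accuracy or runtime speed, which this commit does not report.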

onnx/decoder_model_merged_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:44bcf2271579150f424bad74897c184761541141a1e283e97108ce4ae8ae7e57
+ size 177088140
onnx/decoder_model_merged_fp16.onnx CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5ef723757452d012127c33135a01461f889a7b8b91816ec40bec9fc4ebe71685
- size 326176733
+ oid sha256:6e8197a9d82409572640be301cf1915062f4813b09fd4a6cca0f9e6c4880f946
+ size 325925282
onnx/decoder_model_merged_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:247759b22643c7b0d4a9bc6ecdc83a1f435d5a9c835bdc6850244060cce58112
+ size 163663156
onnx/decoder_model_merged_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fd8166a321b18cc1c2eb1640ef192e7f5838f540f6c1df8b6f6beada9c301e15
+ size 185706225
onnx/decoder_model_merged_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0a9821e0ff226e31076f367eb4130eb4cb9152ae37d2201eb8b83cfd81b82c23
+ size 127699375
onnx/decoder_model_merged_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d049d9da98bfc51f588ecc22718ebcbe6647e03be8bae27f829c8739f33a162d
+ size 163663223
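
Each diff above touches a Git LFS pointer file rather than the model itself: three `key value` lines giving the spec version, the SHA-256 of the real object, and its size in bytes. A minimal sketch of how a downloaded file could be checked against such a pointer follows. The names `LfsPointer`, `parse_pointer`, and `matches_pointer` are hypothetical helpers written for this illustration, not part of Git LFS or the Hugging Face tooling, and the parser assumes a well-formed pointer containing only these three keys.

```python
import hashlib
import io
from dataclasses import dataclass
from typing import BinaryIO


@dataclass(frozen=True)
class LfsPointer:
    version: str
    oid: str   # hex digest, without the "sha256:" prefix
    size: int  # size of the real object in bytes


def parse_pointer(text: str) -> LfsPointer:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return LfsPointer(fields["version"], digest, int(fields["size"]))


def matches_pointer(pointer: LfsPointer, stream: BinaryIO) -> bool:
    """Return True if the stream's byte count and SHA-256 match the pointer."""
    digest = hashlib.sha256()
    total = 0
    # Read in 1 MiB chunks so large model files are never held in memory at once.
    for chunk in iter(lambda: stream.read(1 << 20), b""):
        digest.update(chunk)
        total += len(chunk)
    return total == pointer.size and digest.hexdigest() == pointer.oid


# Pointer for onnx/decoder_model_merged_bnb4.onnx, copied from this commit.
BNB4_POINTER = parse_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:44bcf2271579150f424bad74897c184761541141a1e283e97108ce4ae8ae7e57\n"
    "size 177088140\n"
)

if __name__ == "__main__":
    # Self-check with in-memory data; for a real file, pass open(path, "rb").
    data = b"example bytes"
    pointer = LfsPointer(BNB4_POINTER.version, hashlib.sha256(data).hexdigest(), len(data))
    print(matches_pointer(pointer, io.BytesIO(data)))
```

To check a real download, open the local `.onnx` file in binary mode and pass the handle to `matches_pointer` together with the parsed pointer for that file.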