Add/update the quantized ONNX model files and README.md for Transformers.js v3

by whitphx HF Staff - opened 2 days ago

base: refs/heads/main ← from: refs/pr/4

Files changed: +17 -2

Applied Quantizations

✅ Based on `decoder_model_merged.onnx` with slimming

↳ ✅ fp16: decoder_model_merged_fp16.onnx (replaced because it was invalid)
↳ ✅ int8: decoder_model_merged_int8.onnx (added)
↳ ✅ uint8: decoder_model_merged_uint8.onnx (added)
↳ ✅ q4: decoder_model_merged_q4.onnx (added)
↳ ✅ q4f16: decoder_model_merged_q4f16.onnx (added)
↳ ✅ bnb4: decoder_model_merged_bnb4.onnx (added)
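
The files listed above all follow one naming pattern: the base name `decoder_model_merged`, an underscore, the quantization label, then `.onnx`. A minimal sketch of that pattern follows. `quantizedFileName` and `BASE_NAME` are hypothetical names introduced here for illustration and are not part of Transformers.js; the pattern is inferred only from the six files in this PR.

```typescript
// Hypothetical helper, not a Transformers.js API: rebuilds the file names
// listed in this PR from their quantization labels.
type Dtype = "fp16" | "int8" | "uint8" | "q4" | "q4f16" | "bnb4";

// Base model name as it appears in this PR (without the .onnx extension).
const BASE_NAME = "decoder_model_merged";

// Assumption: every quantized variant is named <base>_<dtype>.onnx,
// which matches all six entries in the list above.
function quantizedFileName(dtype: Dtype, baseName: string = BASE_NAME): string {
  return `${baseName}_${dtype}.onnx`;
}

// The six variants added or replaced by this PR.
const ALL_DTYPES: Dtype[] = ["fp16", "int8", "uint8", "q4", "q4f16", "bnb4"];
const fileNames: string[] = ALL_DTYPES.map((d) => quantizedFileName(d));

console.log(fileNames.join("\n"));
```

In Transformers.js v3, a variant is normally chosen through the `dtype` loading option rather than by file name, so a helper like this is only useful for checking which files a repository should contain.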

Add/update the quantized ONNX model files and README.md for Transformers.js v3 (commit 79f5ef38)

Ready to merge

This branch is ready to get merged automatically.
