Add/update the quantized ONNX model files and README.md for Transformers.js v3

#13
Opened by whitphx (HF Staff)

Applied Quantizations

✅ Based on encodec_decode.onnx with slimming

↳ ✅ q4f16: encodec_decode_q4f16.onnx (added)

✅ Based on decoder_model.onnx with slimming

↳ ✅ q4f16: decoder_model_q4f16.onnx (added)

✅ Based on text_encoder.onnx with slimming

↳ ✅ q4f16: text_encoder_q4f16.onnx (added)

✅ Based on decoder_with_past_model.onnx with slimming

↳ ✅ q4f16: decoder_with_past_model_q4f16.onnx (added)

✅ Based on build_delay_pattern_mask.onnx with slimming

↳ ✅ q4f16: build_delay_pattern_mask_q4f16.onnx (added)

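For context, "q4f16" generally denotes weights quantized to 4-bit integers with per-block float16 scale factors, which is what makes these variants much smaller than the fp32 originals. Below is a minimal NumPy sketch of that idea (blockwise symmetric 4-bit quantization with fp16 scales); it illustrates the storage scheme only and is not the exact algorithm ONNX Runtime or the conversion script uses:

```python
import numpy as np

def quantize_q4f16(weights, block_size=32):
    """Quantize a float array to 4-bit values (stored in uint8) plus fp16 scales."""
    w = np.asarray(weights, dtype=np.float32).ravel()
    pad = (-len(w)) % block_size
    w = np.pad(w, (0, pad)).reshape(-1, block_size)
    # One float16 scale per block; 4-bit codes span [0, 15] around a zero point of 8.
    scales = (np.abs(w).max(axis=1, keepdims=True) / 7.0 + 1e-12).astype(np.float16)
    q = np.clip(np.round(w / scales.astype(np.float32)) + 8, 0, 15).astype(np.uint8)
    return q, scales

def dequantize_q4f16(q, scales):
    """Recover approximate float32 weights from 4-bit codes and fp16 scales."""
    return (q.astype(np.float32) - 8) * scales.astype(np.float32)

w = np.array([0.5, -1.0, 0.25, 0.75])
q, s = quantize_q4f16(w, block_size=4)
w_hat = dequantize_q4f16(q, s).ravel()[: len(w)]
```

Each 4-bit code can also be packed two-per-byte on disk; the sketch above keeps one code per `uint8` for clarity.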
Xenova changed pull request status to merged
