---
license: mit
language:
  - en
---

This is a quantized (INT8) ONNX variant of the bge-large-en-v1.5 embedding model, created with the DeepSparse Optimum pipeline for ONNX export/inference and Neural Magic's Sparsify for one-shot quantization.

The model achieves 100% accuracy recovery on the STSB validation dataset relative to the dense ONNX variant.
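STSB-style evaluation scores a model by how well the similarity of its sentence embeddings tracks human similarity judgments, with embedding similarity computed as cosine similarity. A pure-Python sketch of that similarity computation (illustrative only, not the evaluation harness; the toy vectors stand in for 1024-dimensional bge-large embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real sentence embeddings.
u = [0.1, 0.3, 0.5]
v = [0.2, 0.1, 0.4]
print(cosine_similarity(u, u))  # identical vectors score ~1.0
print(cosine_similarity(u, v))
```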

Other sparse and quantized bge ONNX models:

- [zeroshot/bge-large-en-v1.5-sparse](https://huggingface.co/zeroshot/bge-large-en-v1.5-sparse)
- [zeroshot/bge-base-en-v1.5-sparse](https://huggingface.co/zeroshot/bge-base-en-v1.5-sparse)
- [zeroshot/bge-base-en-v1.5-quant](https://huggingface.co/zeroshot/bge-base-en-v1.5-quant)
- [zeroshot/bge-small-en-v1.5-sparse](https://huggingface.co/zeroshot/bge-small-en-v1.5-sparse)
- [zeroshot/bge-small-en-v1.5-quant](https://huggingface.co/zeroshot/bge-small-en-v1.5-quant)