metadata
license: apache-2.0
base_model: Qwen/Qwen3-Embedding-0.6B
tags:
- qwen3
- text-embeddings-inference
- onnx
- sentence-transformers
- feature-extraction
- sentence-similarity
language:
- multilingual
pipeline_tag: sentence-similarity
library_name: sentence-transformers
Qwen3-Embedding-0.6B ONNX for TEI
This is an ONNX version of Qwen/Qwen3-Embedding-0.6B optimized for Text Embeddings Inference (TEI).
Model Details
- Base Model: Qwen/Qwen3-Embedding-0.6B
- Format: ONNX with external data (model.onnx + model.onnx_data)
- Pooling: Mean pooling (built into the ONNX graph)
- Embedding Dimension: 1024
- Max Sequence Length: 32768 tokens
Usage with TEI
docker run --gpus all -p 8080:80 -v "$PWD":/data \
  ghcr.io/huggingface/text-embeddings-inference:latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
For CPU inference:
docker run -p 8080:80 -v "$PWD":/data \
  ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
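Once either container is running, the server can be queried over HTTP. The following is a minimal sketch, assuming the container is reachable at http://localhost:8080 (the port mapping in the commands above) and serves TEI's `/embed` route, which accepts a JSON body with an `inputs` field and returns one vector per input. The helper names (`build_request`, `embed`, `cosine_similarity`, `demo`) are illustrative and not part of TEI.

```python
import json
import math
import urllib.request

# Assumption: matches the -p 8080:80 port mapping used in the docker commands.
TEI_URL = "http://localhost:8080"


def build_request(texts, base_url=TEI_URL):
    """Build a POST request for the /embed route (hypothetical helper)."""
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, base_url=TEI_URL):
    """Send texts to the server and return one embedding per text."""
    with urllib.request.urlopen(build_request(texts, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def demo():
    """Requires a running TEI container; call this manually."""
    vectors = embed(["What is ONNX?", "ONNX is an open model format."])
    # The vector length should match the embedding dimension listed above.
    print(len(vectors[0]))
    print(cosine_similarity(vectors[0], vectors[1]))
```

Calling `demo()` with a container running prints the vector length and the similarity of the two sentences. Only the standard library is used here so the sketch has no extra dependencies; an HTTP client such as `requests` would work equally well.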
Conversion Details
This model was converted from the original PyTorch model to ONNX format, with the following properties:
- Consolidated external data for TEI compatibility
- Mean pooling integrated into the ONNX graph
- Optimized for CPU inference
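To make the pooling step concrete: mean pooling averages the per-token hidden states, skipping padded positions, to produce one vector per input. The sketch below illustrates that computation in plain Python on toy data. It is not the exported graph itself, and `mean_pool` is a hypothetical helper name.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors where attention_mask is 1, ignoring padding.

    token_embeddings: list of token vectors (seq_len x hidden_size)
    attention_mask:   list of 0/1 flags, one per token
    """
    hidden_size = len(token_embeddings[0])
    totals = [0.0] * hidden_size
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    # Guard against an all-padding sequence to avoid division by zero.
    count = max(count, 1)
    return [total / count for total in totals]


# Toy example: three tokens with hidden size 2; the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

Because the padded token is masked out, its large values do not affect the result. With pooling performed inside the graph, a client receives one vector per input rather than per-token states.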
Original Model
See the original model card at Qwen/Qwen3-Embedding-0.6B for:
- Model architecture details
- Training information
- Benchmark results
- Citation information
License
Apache 2.0 (same as the original model)