---
license: apache-2.0
base_model: Qwen/Qwen3-Embedding-0.6B
tags:
- qwen3
- text-embeddings-inference
- onnx
- sentence-transformers
- feature-extraction
- sentence-similarity
language:
- multilingual
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# Qwen3-Embedding-0.6B ONNX for TEI
This is an ONNX version of [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) optimized for [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference).
## Model Details
- **Base Model**: Qwen/Qwen3-Embedding-0.6B
- **Format**: ONNX with external data (`model.onnx` + `model.onnx_data`)
- **Pooling**: Mean pooling (built into the ONNX graph)
- **Embedding Dimension**: 1024
- **Max Sequence Length**: 32768 tokens
## Usage with TEI
```bash
docker run --gpus all -p 8080:80 -v $PWD:/data \
  ghcr.io/huggingface/text-embeddings-inference:latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
```
For CPU inference:
```bash
docker run -p 8080:80 -v $PWD:/data \
  ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
```
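Once a container is up, embeddings can be requested over TEI's HTTP API. A minimal example, assuming the server is reachable on `localhost:8080` as in the commands above (the input sentences are arbitrary):
```bash
# Request embeddings from the running TEI server.
curl 127.0.0.1:8080/embed \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": ["What is deep learning?", "TEI serves ONNX embedding models."]}'
```
The response is a JSON array with one 1024-dimensional vector per input string, matching the embedding dimension listed above.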
## Conversion Details
This model was converted from the original PyTorch checkpoint to ONNX with:
- Consolidated external data for TEI compatibility
- Mean pooling integrated into the ONNX graph
- Optimized for CPU inference
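The conversion script itself is not included in this repository. A rough sketch of a comparable export, assuming `optimum` and the `onnx` Python package are installed (the output paths are illustrative, and the step that bakes mean pooling into the graph is additional post-processing not shown here):
```bash
# Export the PyTorch checkpoint to ONNX for feature extraction.
optimum-cli export onnx \
  --model Qwen/Qwen3-Embedding-0.6B \
  --task feature-extraction \
  qwen3-embedding-onnx/

# Consolidate the external weight files into a single model.onnx_data file,
# the layout TEI expects next to model.onnx.
python -c "
import onnx
m = onnx.load('qwen3-embedding-onnx/model.onnx')
onnx.save_model(
    m,
    'qwen3-embedding-onnx/model.onnx',
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location='model.onnx_data',
)
"
```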
## Original Model
See the original model card at [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) for:
- Model architecture details
- Training information
- Benchmark results
- Citation information
## License
Apache 2.0 (same as the original model)