---
license: apache-2.0
base_model: Qwen/Qwen3-Embedding-0.6B
tags:
- qwen3
- text-embeddings-inference
- onnx
- sentence-transformers
- feature-extraction
- sentence-similarity
language:
- multilingual
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
|
|
|
# Qwen3-Embedding-0.6B ONNX for TEI |
|
|
|
This is an ONNX version of [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) optimized for [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference). |
|
|
|
## Model Details |
|
|
|
- **Base Model**: Qwen/Qwen3-Embedding-0.6B
- **Format**: ONNX with external data (`model.onnx` + `model.onnx_data`)
- **Pooling**: Mean pooling (built into the ONNX graph)
- **Embedding Dimension**: 1024
- **Max Sequence Length**: 32768 tokens
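
The external-data layout listed above can be checked locally before serving, for example by downloading just the ONNX files with the `huggingface_hub` CLI. This is an optional sketch; the target directory name is only illustrative.

```bash
# Fetch only the ONNX graph and its external weights file
pip install -U huggingface_hub
huggingface-cli download janni-t/qwen3-embedding-0.6b-tei-onnx \
  --include "model.onnx*" \
  --local-dir ./qwen3-embedding-0.6b-tei-onnx
ls ./qwen3-embedding-0.6b-tei-onnx   # should include model.onnx and model.onnx_data
```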
|
|
|
## Usage with TEI |
|
|
|
```bash
docker run --gpus all -p 8080:80 -v $PWD:/data \
  ghcr.io/huggingface/text-embeddings-inference:latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
```
|
|
|
For CPU inference: |
|
```bash
docker run -p 8080:80 -v $PWD:/data \
  ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
  --model-id janni-t/qwen3-embedding-0.6b-tei-onnx
```
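
Once the container is running, embeddings are served over TEI's standard HTTP API. The request below is a minimal example against the `/embed` route on the port mapped above; the trailing `jq` filter is optional and simply confirms that the returned vector has 1024 dimensions.

```bash
# Query the /embed endpoint of the running TEI container
curl 127.0.0.1:8080/embed \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is the capital of France?"}' \
  | jq '.[0] | length'   # prints 1024, the embedding dimension
```

The response is a JSON array containing one embedding per input string.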
|
|
|
## Conversion Details |
|
|
|
This model was converted from the original PyTorch model to ONNX format with: |
|
- Consolidated external data for TEI compatibility
- Mean pooling integrated into the ONNX graph
- Optimized for CPU inference
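
The exact conversion scripts are not included in this repository. As a rough, hedged sketch, a comparable base export can be produced with Hugging Face Optimum; note that this command alone does not perform the pooling integration or external-data consolidation described above, so treat it as a starting point rather than a reproduction of this model.

```bash
# Hypothetical re-export sketch (not the exact pipeline used for this repo)
pip install "optimum[exporters]"
optimum-cli export onnx \
  --model Qwen/Qwen3-Embedding-0.6B \
  --task feature-extraction \
  qwen3-embedding-0.6b-onnx/
```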
|
|
|
## Original Model |
|
|
|
See the original model card at [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) for: |
|
- Model architecture details
- Training information
- Benchmark results
- Citation information
|
|
|
## License |
|
|
|
Apache 2.0 (same as the original model) |