---
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
- es
base_model:
- Qwen/Qwen3-Embedding-0.6B
pipeline_tag: feature-extraction
---
# prudant/Qwen3-Embedding-0.6B-W8A8

This is a compressed version of [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B), quantized with [llm-compressor](https://github.com/vllm-project/llm-compressor) using the W8A8 scheme.
**Important:** you MUST read the following guide for correct usage of this model: Guide
## Model Details
- Original Model: Qwen/Qwen3-Embedding-0.6B
- Quantization Method: GPTQ
- Compression Library: llm-compressor
- Calibration Dataset: ultrachat_200k (1024 samples)
- Optimized For: Inference with vLLM
- License: same as original model
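Since the model is optimized for vLLM inference, a typical deployment is to serve it with vLLM's OpenAI-compatible server (`vllm serve prudant/Qwen3-Embedding-0.6B-W8A8`) and query the `/v1/embeddings` endpoint. The sketch below assumes a server already running on `localhost:8000` with default settings; the endpoint URL and port are assumptions, not part of this card.

```python
# Minimal sketch: query embeddings from a vLLM OpenAI-compatible server
# and compare two texts with cosine similarity.
# Assumes: `vllm serve prudant/Qwen3-Embedding-0.6B-W8A8` is already running
# on http://localhost:8000 (hypothetical local setup).
import json
import math
import urllib.request


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed(texts, url="http://localhost:8000/v1/embeddings"):
    """POST texts to the OpenAI-compatible embeddings endpoint."""
    payload = json.dumps({
        "model": "prudant/Qwen3-Embedding-0.6B-W8A8",
        "input": texts,
    }).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return [item["embedding"] for item in data["data"]]


# Example usage (requires the server above to be running):
# vecs = embed(["What is the capital of France?",
#               "Paris is the capital of France."])
# print(cosine_similarity(vecs[0], vecs[1]))
```

Remember to follow the usage guide referenced above (e.g. any required instruction or query prefix for the embedding model) before relying on the similarity scores.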