---
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
- es
base_model:
- Qwen/Qwen3-Embedding-0.6B
pipeline_tag: feature-extraction
---
# prudant/Qwen3-Embedding-0.6B-W8A8

This is a compressed version of [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B), quantized with [llm-compressor](https://github.com/vllm-project/llm-compressor) using the W8A8 scheme.
**Important:** you MUST read the following guide for correct usage of this model: Guide
## Model Details
- Original Model: Qwen/Qwen3-Embedding-0.6B
- Quantization Method: GPTQ
- Compression Library: llm-compressor
- Calibration Dataset: ultrachat_200k (1024 samples)
- Optimized For: Inference with vLLM
- License: same as original model
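Since the model is optimized for vLLM inference, a typical deployment is to serve it with vLLM's OpenAI-compatible server (`vllm serve prudant/Qwen3-Embedding-0.6B-W8A8`) and query the `/v1/embeddings` endpoint. The sketch below assumes a server already running on `localhost:8000` with default settings; the endpoint URL and port are assumptions, not part of this card.

```python
# Minimal sketch: query embeddings from a vLLM OpenAI-compatible server
# and compare two texts with cosine similarity.
# Assumes: `vllm serve prudant/Qwen3-Embedding-0.6B-W8A8` is already running
# on http://localhost:8000 (hypothetical local setup).
import json
import math
import urllib.request


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed(texts, url="http://localhost:8000/v1/embeddings"):
    """POST texts to the OpenAI-compatible embeddings endpoint."""
    payload = json.dumps({
        "model": "prudant/Qwen3-Embedding-0.6B-W8A8",
        "input": texts,
    }).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return [item["embedding"] for item in data["data"]]


# Example usage (requires the server above to be running):
# vecs = embed(["What is the capital of France?",
#               "Paris is the capital of France."])
# print(cosine_similarity(vecs[0], vecs[1]))
```

Remember to follow the usage guide referenced above (e.g. any required instruction or query prefix for the embedding model) before relying on the similarity scores.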