Update README.md

[BugFix] Fix compatibility issues with vLLM 0.10.1

README.md CHANGED
````diff
@@ -18,7 +18,7 @@ Base model [Qwen/Qwen3-Coder-480B-A35B-Instruct](https://huggingface.co/Qwen/Qwe
 
 
 ### 【VLLM Launch Command for 8 GPUs (Single Node)】
-<i
+<i>Note: When launching with 8 GPUs, --enable-expert-parallel must be specified; otherwise, the expert tensors cannot be evenly split across tensor parallel ranks. This option is not required for 4-GPU setups.</i>
 ```
 CONTEXT_LENGTH=32768 # 262144
 
@@ -46,6 +46,9 @@ vllm>=0.9.2
 
 ### 【Model Update History】
 ```
+2025-08-19
+1.[BugFix] Fix compatibility issues with vLLM 0.10.1
+
 2025-08-11
 1.Upload tokenizer_config.json
 
````
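The note added in this commit says that 8-GPU launches must pass `--enable-expert-parallel` so the expert tensors can be split across tensor-parallel ranks. A minimal launch sketch of what that looks like, assuming the standard `vllm serve` CLI; the model ID and `CONTEXT_LENGTH` come from the diff above, while the other values are illustrative assumptions, not the repo's exact command:

```shell
# Context window from the README; the comment there suggests up to 262144 is possible.
CONTEXT_LENGTH=32768

# 8-GPU single-node launch: --enable-expert-parallel is required here,
# per the note; a 4-GPU setup (--tensor-parallel-size 4) would not need it.
vllm serve Qwen/Qwen3-Coder-480B-A35B-Instruct \
  --tensor-parallel-size 8 \
  --enable-expert-parallel \
  --max-model-len "$CONTEXT_LENGTH"
```

With tensor parallelism alone, each expert's weights would have to divide evenly across all 8 ranks; expert parallelism instead distributes whole experts across GPUs, which is why the flag matters at TP=8 for this MoE model.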