linzhao-amd committed (verified)
Commit 71bfc2f · 1 Parent(s): bd72b03

Update README.md

Files changed (1): README.md (+1 −1)
README.md CHANGED
@@ -20,7 +20,7 @@ The model is the quantized version of the Meta-Llama 3.1-405B-Instruct model, wh
 
 # Model Quantization
 
-This model was obtained by quantizing weights and activations of [Meta-Llama-3.1-405B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct) to MXFP4 and KV cache to FP8, using AutoSmoothQuant algorithm in AMD-Quark.
+This model was obtained by quantizing weights and activations of [Meta-Llama-3.1-405B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct) to MXFP4 and KV cache to FP8, using AutoSmoothQuant algorithm in [AMD-Quark](https://quark.docs.amd.com/latest/index.html).
 
 **Quantization scripts:**
 ```
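For context on the MXFP4 format the README refers to: MXFP4 is a microscaling format in which values are grouped into blocks of 32, each block sharing a single power-of-two (E8M0) scale, with the individual elements stored as FP4 E2M1 (representable magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6). The sketch below illustrates that block quantization in NumPy. It is only an illustration of the number format under these assumptions, not AMD-Quark's implementation, and the function name is hypothetical.

```python
import numpy as np

# FP4 E2M1 representable magnitudes (sign is handled separately).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_mxfp4_block(block):
    """Fake-quantize one block of values to MXFP4-style:
    a shared power-of-two scale plus FP4 E2M1 elements.
    Returns the dequantized block (scale * rounded FP4 values)."""
    amax = np.max(np.abs(block))
    if amax == 0.0:
        return np.zeros_like(block)
    # Pick a power-of-two scale so the block maximum lands within the
    # FP4 range; E2M1's largest exponent is 2 (6.0 = 1.5 * 2**2).
    scale = 2.0 ** (np.floor(np.log2(amax)) - 2)
    scaled = block / scale
    # Round each scaled magnitude to the nearest FP4 grid point
    # (values beyond 6.0 saturate to the top grid entry).
    idx = np.argmin(np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]), axis=1)
    return np.sign(scaled) * FP4_GRID[idx] * scale

# Example: quantize one 32-element block and inspect the error.
x = np.linspace(-4.0, 4.0, 32)
xq = quantize_mxfp4_block(x)
```

Because every element collapses onto one of just 16 signed grid points times a shared scale, the rounding error grows with magnitude (the grid spacing widens from 0.5 to 2.0), which is why activation-smoothing methods such as AutoSmoothQuant are paired with formats this narrow.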