Update README.md
README.md
CHANGED
|
@@ -8,6 +8,13 @@ license: mit
|
|
| 8 |
[Phi4-mini](https://huggingface.co/microsoft/Phi-4-mini-instruct) model quantized with [torchao](https://huggingface.co/docs/transformers/main/en/quantization/torchao) float8 dynamic activation and float8 weight quantization (per-row granularity), by the PyTorch team.
|
| 9 |
|
| 10 |
# Quantization Recipe
|
| 11 |
We used the following code to get the quantized model:
|
| 12 |
|
| 13 |
```
|
|
|
|
| 8 |
[Phi4-mini](https://huggingface.co/microsoft/Phi-4-mini-instruct) model quantized with [torchao](https://huggingface.co/docs/transformers/main/en/quantization/torchao) float8 dynamic activation and float8 weight quantization (per-row granularity), by the PyTorch team.
|
| 9 |
|
| 10 |
# Quantization Recipe
|
| 11 |
+
|
| 12 |
+
Installation:
|
| 13 |
+
```
|
| 14 |
+
pip install transformers
|
| 15 |
+
pip install --pre torchao --index-url https://download.pytorch.org/whl/nightly/cu126
|
| 16 |
+
```
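Because the torchao install above comes from a nightly index, it can help to confirm which versions actually ended up in the environment before running the recipe. The following is a minimal sketch, not part of the original README: the helper name `installed_versions` is hypothetical, and the default package list (including `torch`, which the install commands do not name explicitly) is an assumption.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "torchao", "transformers")):
    """Map each package name to its installed version, or None if missing.

    The default package list is an assumption for illustration; adjust it
    to whatever the environment is expected to contain.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Reading versions from package metadata avoids importing the libraries themselves, so the check stays fast and works even on a machine without a GPU.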
|
| 17 |
+
|
| 18 |
We used the following code to get the quantized model:
|
| 19 |
|
| 20 |
```
|