Latest version as of Feb 24, 2024:

Extract the two .zip files directly into the llama.cpp folder you just git cloned. Overwrite files as prompted.
## Download Mixtral
* Download the full-blast version of the model: save all .safetensors, .json, and .model files to D:\Mixtral\
* https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1

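Only the weight and tokenizer files are needed; the rest of the repo can be skipped. As a rough illustration (the file names below are hypothetical examples, not the repo's actual listing), a small pattern filter picks out exactly the files the step above asks for:

```python
from fnmatch import fnmatch

# The three file types the download step above says to grab.
WANTED = ("*.safetensors", "*.json", "*.model")

def wanted(name: str) -> bool:
    """True if the file matches one of the needed patterns."""
    return any(fnmatch(name, pat) for pat in WANTED)

# Hypothetical repo listing, for illustration only.
listing = [
    "model-00001-of-00019.safetensors",
    "config.json",
    "tokenizer.model",
    "README.md",
    ".gitattributes",
]

print([f for f in listing if wanted(f)])
# -> ['model-00001-of-00019.safetensors', 'config.json', 'tokenizer.model']
```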
## Convert the model to fp16:
* D:\llama.cpp>python convert.py D:\Mixtral --outtype f16 --outfile D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin

## Quantize the fp16 model to q5_k_m:

* D:\llama.cpp>quantize.exe D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf q5_k_m

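The two steps above need a fair amount of free disk space. A back-of-the-envelope sketch (the ~46.7B parameter count for Mixtral 8x7B and the ~5.5 bits/weight figure for q5_k_m are rough assumptions of mine, not values reported by convert.py or quantize.exe):

```python
# Rough disk-space estimate for the fp16 intermediate and the
# quantized .gguf. Parameter count and bits-per-weight are
# approximations, not exact tool output.

def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate file size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 46.7e9  # assumed total parameter count for Mixtral 8x7B

fp16_gb = model_size_gb(N_PARAMS, 16.0)  # the .fp16.bin intermediate
q5km_gb = model_size_gb(N_PARAMS, 5.5)   # the final .gguf, roughly

print(f"fp16:   ~{fp16_gb:.0f} GB")   # ~93 GB
print(f"q5_k_m: ~{q5km_gb:.0f} GB")   # ~32 GB
```

So plan for roughly 125 GB free on D:\ while both files exist; the fp16 intermediate can be deleted once quantization succeeds.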
That's it. Load up the resulting .gguf file like you normally would.