Commit 3007636 (verified) by OptimizeLLM · Parent: 0711a5e

Update README.md

Files changed (1): README.md (+5 -5)
README.md CHANGED
@@ -42,14 +42,14 @@ Latest version as of Feb 24, 2024:
 Extract the two .zip files directly into the llama.cpp folder you just git cloned. Overwrite files as prompted.
 
 ## Download Mixtral
-Download the full-blast version of the model by downloading all .safetensors, .json, and .model files to D:\Mixtral\
-https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
-Download all .safetensors, .json, and .model files to D:\Mixtral\
+* Download the full-blast version of the model by downloading all .safetensors, .json, and .model files to D:\Mixtral\:
+* https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
+
 
 ## Convert the model to fp16:
-D:\llama.cpp>python convert.py D:\Mixtral --outtype f16 --outfile D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin
+* D:\llama.cpp>python convert.py D:\Mixtral --outtype f16 --outfile D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin
 
 ## Quantize the fp16 model to q5_k_m:
-D:\llama.cpp>quantize.exe D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf q5_k_m
+* D:\llama.cpp>quantize.exe D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf q5_k_m
 
 That's it. Load up the resulting .gguf file like you normally would.
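
For reference, the three steps in the updated README (download, fp16 conversion, q5_k_m quantization) can be scripted end to end. Below is a minimal sketch, assuming huggingface_hub is installed, that the paths follow the D:\Mixtral / D:\llama.cpp layout above, and that convert.py and quantize.exe are the ones extracted into the llama.cpp folder as the README describes:

```python
# Minimal sketch of the README's pipeline: download -> convert to fp16 -> quantize to q5_k_m.
import subprocess
from huggingface_hub import snapshot_download  # pip install huggingface_hub

MODEL_DIR = r"D:\Mixtral"
LLAMA_CPP = r"D:\llama.cpp"
FP16_BIN = rf"{MODEL_DIR}\Mixtral-8x7B-Instruct-v0.1.fp16.bin"
GGUF_OUT = rf"{MODEL_DIR}\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf"

# Step 1: fetch only the .safetensors, .json, and .model files, as the README instructs.
snapshot_download(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    local_dir=MODEL_DIR,
    allow_patterns=["*.safetensors", "*.json", "*.model"],
)

# Step 2: convert the safetensors checkpoint to a single fp16 file.
subprocess.run(
    ["python", "convert.py", MODEL_DIR, "--outtype", "f16", "--outfile", FP16_BIN],
    cwd=LLAMA_CPP,
    check=True,
)

# Step 3: quantize fp16 down to q5_k_m, producing the .gguf to load as usual.
subprocess.run(
    [rf"{LLAMA_CPP}\quantize.exe", FP16_BIN, GGUF_OUT, "q5_k_m"],
    check=True,
)
```

Each subprocess call mirrors the exact command line shown in the diff, so if the script fails at any step, the same command can be rerun by hand from D:\llama.cpp.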