Commit 3007636 (verified) by OptimizeLLM · Parent: 0711a5e

Update README.md

Files changed (1): README.md (+5 -5)
README.md CHANGED
@@ -42,14 +42,14 @@ Latest version as of Feb 24, 2024:
 Extract the two .zip files directly into the llama.cpp folder you just git cloned. Overwrite files as prompted.
 
 ## Download Mixtral
-Download the full-blast version of the model by downloading all .safetensors, .json, and .model files to D:\Mixtral\
-https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
-Download all .safetensors, .json, and .model files to D:\Mixtral\
+* Download the full-blast version of the model by downloading all .safetensors, .json, and .model files to D:\Mixtral\:
+* https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
+
 
 ## Convert the model to fp16:
-D:\llama.cpp>python convert.py D:\Mixtral --outtype f16 --outfile D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin
+* D:\llama.cpp>python convert.py D:\Mixtral --outtype f16 --outfile D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin
 
 ## Quantize the fp16 model to q5_k_m:
-D:\llama.cpp>quantize.exe D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf q5_k_m
+* D:\llama.cpp>quantize.exe D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf q5_k_m
 
 That's it. Load up the resulting .gguf file like you normally would.
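
For reference, the three steps in the updated README (download, fp16 conversion, q5_k_m quantization) can be scripted end to end. Below is a minimal sketch, assuming huggingface_hub is installed, that the paths follow the D:\Mixtral / D:\llama.cpp layout above, and that convert.py and quantize.exe are the ones extracted into the llama.cpp folder as the README describes:

```python
# Minimal sketch of the README's pipeline: download -> convert to fp16 -> quantize to q5_k_m.
import subprocess
from huggingface_hub import snapshot_download  # pip install huggingface_hub

MODEL_DIR = r"D:\Mixtral"
LLAMA_CPP = r"D:\llama.cpp"
FP16_BIN = rf"{MODEL_DIR}\Mixtral-8x7B-Instruct-v0.1.fp16.bin"
GGUF_OUT = rf"{MODEL_DIR}\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf"

# Step 1: fetch only the .safetensors, .json, and .model files, as the README instructs.
snapshot_download(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    local_dir=MODEL_DIR,
    allow_patterns=["*.safetensors", "*.json", "*.model"],
)

# Step 2: convert the safetensors checkpoint to a single fp16 file.
subprocess.run(
    ["python", "convert.py", MODEL_DIR, "--outtype", "f16", "--outfile", FP16_BIN],
    cwd=LLAMA_CPP,
    check=True,
)

# Step 3: quantize fp16 down to q5_k_m, producing the .gguf to load as usual.
subprocess.run(
    [rf"{LLAMA_CPP}\quantize.exe", FP16_BIN, GGUF_OUT, "q5_k_m"],
    check=True,
)
```

Each subprocess call mirrors the exact command line shown in the diff, so if the script fails at any step, the same command can be rerun by hand from D:\llama.cpp.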