CRD716/ggml-LLaMa-65B-quantized


CRD716 committed on May 27, 2023

Commit 6fa3d89 · 1 Parent(s): 8ebb55e

Update README.md

Files changed (1)

README.md +1 -1


@@ -32,7 +32,7 @@ language:
 LLaMa 65B converted to ggml via LLaMa.cpp, then quantized to 4bit.
-Legacy is for llama.cpp setups older than https://github.com/ggerganov/llama.cpp/pull/1405, the regular is faster but does not work on old versions.
+Legacy is for llama.cpp setups older than https://github.com/ggerganov/llama.cpp/pull/1508, the regular is faster but does not work on old versions.
 I recommend the following settings when running as a good starting point:
 ```main.exe -m ggml-LLaMa-65B-q4_0.bin -n -1 -t 32 -c 2048 --temp 0.7 --repeat_penalty 1.2 --mirostat 2 --interactive-first --color```
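
For anyone scripting the recommended invocation rather than typing it, the command in the README can be assembled as an argument list. This is only a sketch: `build_command` is a hypothetical helper, not part of llama.cpp, and it assumes `main.exe` and the model file sit in the working directory. The flag values are copied from the README; `-t 32` is the thread count and probably needs adjusting for the machine.

```python
import shlex


def build_command(model_path="ggml-LLaMa-65B-q4_0.bin",
                  executable="main.exe",
                  threads=32):
    """Return the README's recommended command as an argv list.

    Hypothetical helper for illustration; the defaults mirror the README.
    """
    return [
        executable,
        "-m", model_path,         # model file named in the README
        "-n", "-1",
        "-t", str(threads),       # thread count; the README uses 32
        "-c", "2048",
        "--temp", "0.7",
        "--repeat_penalty", "1.2",
        "--mirostat", "2",
        "--interactive-first",
        "--color",
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted line for inspection.
    print(shlex.join(build_command()))
```

The list form can be passed directly to `subprocess.run` without worrying about shell quoting; overriding `threads` or `model_path` leaves the remaining settings as the README gives them.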