Update README.md
README.md CHANGED
@@ -42,14 +42,17 @@ Details of the files provided:
 
 ## How to run in `text-generation-webui`
 
-The `safetensors` model file was created with the
+The `safetensors` model file was created with the GPTQ-for-LLaMa code as of April 13th, and uses `--act-order` to give the maximum possible quantisation quality. This means it requires that this same version of GPTQ-for-LLaMa is used inside the UI.
 
 Here are the commands I used to clone the Triton branch of GPTQ-for-LLaMa, clone text-generation-webui, and install GPTQ into the UI:
 ```
-
+# Since April 14th we can't clone the latest GPTQ-for-LLaMa as it's in the middle of a refactoring
+git clone -n https://github.com/qwopqwop200/GPTQ-for-LLaMa gptq-working
+cd gptq-working && git checkout 58c8ab4c7aaccc50f507fd08cce941976affe5e0 # Later commits are currently broken due to ongoing refactoring
+
 git clone https://github.com/oobabooga/text-generation-webui
 mkdir -p text-generation-webui/repositories
-ln -s
+ln -s gptq-working text-generation-webui/repositories/GPTQ-for-LLaMa
 ```
 
 Then install this model into `text-generation-webui/models` and launch the UI as follows:
@@ -60,7 +63,7 @@ python server.py --model gpt4-alpaca-lora-30B-GPTQ-4bit-128g --wbits 4 --groupsi
 
 The above commands assume you have installed all dependencies for GPTQ-for-LLaMa and text-generation-webui. Please see their respective repositories for further information.
 
-If you are on Windows, or cannot use the Triton branch of GPTQ for any other reason, you can instead
+If you are on Windows, or cannot use the Triton branch of GPTQ for any other reason, you can instead try the CUDA branch:
 ```
 git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa -b cuda
 cd GPTQ-for-LLaMa