Update README.md: add the early stop parameter
tags:
- gguf
- llama cpp
---

# Octopus V4-GGUF: Graph of language models

**Acknowledgement**:
We sincerely thank our community members, [Mingyuan](https://huggingface.co/ThunderBeee) and [Zoey](https://huggingface.co/ZY6), for their extraordinary contributions to this quantization effort. Please explore [Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4) for our original Hugging Face model.

## (Recommended) Run with [llama.cpp](https://github.com/ggerganov/llama.cpp)

1. **Clone and compile:**

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# Compile the source code:
make
```
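
If you do not yet have the quantized file locally, one way to fetch it is with `huggingface-cli`. This is a minimal sketch; the repo id `NexaAIDev/Octopus-v4-GGUF` and the filename are assumptions, so adjust them to the actual entries listed under this repo's files:

```bash
# Hypothetical download sketch: verify the repo id and filename before running.
huggingface-cli download NexaAIDev/Octopus-v4-GGUF octopus-v4-Q4_K_M.gguf --local-dir .
```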

2. **Prepare the Input Prompt File:**

Navigate to the `prompts` folder inside the `llama.cpp` repository and create a new file named `chat-with-octopus.txt`.

`chat-with-octopus.txt`:

```bash
User:
```

3. **Execute the Model:**

Run the following command in the terminal (`-c` sets the context length, `-n` the number of tokens to generate, `-i` enables interactive mode, `-r "User:"` hands control back to you at the reverse prompt, and `-f` loads the prompt file created above):

```bash
./main -m ./path/to/octopus-v4-Q4_K_M.gguf -c 512 -b 2048 -n 256 -t 1 --repeat_penalty 1.0 --top_k 0 --top_p 1.0 --color -i -r "User:" -f prompts/chat-with-octopus.txt
```

Example prompt to interact:

```bash
<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>
```

Note that `<nexa_4>` in the response represents the math GPT; generation ends with the `<nexa_end>` token, which is why `<nexa_end>` is configured as the early-stop parameter in the Ollama setup below.
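
To run the same routing prompt non-interactively, here is a minimal sketch using the same binary and model path as above; `--temp 0` mirrors the greedy decoding configured for Ollama below:

```bash
# One-shot sketch: pass the routing prompt directly with -p instead of -i/-f.
./main -m ./path/to/octopus-v4-Q4_K_M.gguf -n 256 --temp 0 \
  -p "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
```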

## Run with [Ollama](https://github.com/ollama/ollama)

1. Create a `Modelfile` in your directory and include a `FROM` statement with the path to your local model, along with the decoding and early-stop parameters (the `PARAMETER` directives belong in the `Modelfile` itself, not on the `ollama create` command line):

```bash
FROM ./path/to/octopus-v4-Q4_K_M.gguf
PARAMETER temperature 0
PARAMETER num_ctx 1024
PARAMETER stop <nexa_end>
```

2. Use the following command to add the model to Ollama:

```bash
ollama create octopus-v4-Q4_K_M -f Modelfile
```

3. Verify that the model has been successfully imported:

```bash
ollama ls
```

### Run the model

```bash
ollama run octopus-v4-Q4_K_M "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
```
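
Ollama also exposes a local REST API; this is a minimal sketch, assuming the default server at `localhost:11434` and the model name created above:

```bash
# Hypothetical REST sketch against Ollama's /api/generate endpoint.
curl http://localhost:11434/api/generate -d '{
  "model": "octopus-v4-Q4_K_M",
  "prompt": "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>",
  "stream": false
}'
```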

### Dataset and Benchmark