Update README.md: add the early stop parameter
tags:
- gguf
- llama cpp
---

# Octopus V4-GGUF: Graph of language models

**Acknowledgement**:
We sincerely thank our community members, [Mingyuan](https://huggingface.co/ThunderBeee) and [Zoey](https://huggingface.co/ZY6), for their extraordinary contributions to this quantization effort. Please explore [Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4) for our original Hugging Face model.

## (Recommended) Run with [llama.cpp](https://github.com/ggerganov/llama.cpp)

1. **Clone and compile:**

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# Compile the source code:
make
```
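
If you do not yet have the quantized file locally, one way to fetch it is with `huggingface-cli`. This is a minimal sketch; the repo id `NexaAIDev/Octopus-v4-GGUF` and the filename are assumptions, so adjust them to the actual entries listed under this repo's files:

```bash
# Hypothetical download sketch: verify the repo id and filename before running.
huggingface-cli download NexaAIDev/Octopus-v4-GGUF octopus-v4-Q4_K_M.gguf --local-dir .
```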

2. **Prepare the Input Prompt File:**

Navigate to the `prompts` folder inside the `llama.cpp` repository and create a new file named `chat-with-octopus.txt`.

`chat-with-octopus.txt`:

```bash
User:
```

3. **Execute the Model:**

Run the following command in the terminal (`-c` sets the context length, `-n` the number of tokens to generate, `-i` enables interactive mode, `-r "User:"` hands control back to you at the reverse prompt, and `-f` loads the prompt file created above):

```bash
./main -m ./path/to/octopus-v4-Q4_K_M.gguf -c 512 -b 2048 -n 256 -t 1 --repeat_penalty 1.0 --top_k 0 --top_p 1.0 --color -i -r "User:" -f prompts/chat-with-octopus.txt
```

Example prompt to interact:

```bash
<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>
```

Note that `<nexa_4>` in the response represents the math GPT; generation ends with the `<nexa_end>` token, which is why `<nexa_end>` is configured as the early-stop parameter in the Ollama setup below.
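
To run the same routing prompt non-interactively, here is a minimal sketch using the same binary and model path as above; `--temp 0` mirrors the greedy decoding configured for Ollama below:

```bash
# One-shot sketch: pass the routing prompt directly with -p instead of -i/-f.
./main -m ./path/to/octopus-v4-Q4_K_M.gguf -n 256 --temp 0 \
  -p "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
```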

## Run with [Ollama](https://github.com/ollama/ollama)

1. Create a `Modelfile` in your directory and include a `FROM` statement with the path to your local model, along with the decoding and early-stop parameters (the `PARAMETER` directives belong in the `Modelfile` itself, not on the `ollama create` command line):

```bash
FROM ./path/to/octopus-v4-Q4_K_M.gguf
PARAMETER temperature 0
PARAMETER num_ctx 1024
PARAMETER stop <nexa_end>
```

2. Use the following command to add the model to Ollama:

```bash
ollama create octopus-v4-Q4_K_M -f Modelfile
```

3. Verify that the model has been successfully imported:

```bash
ollama ls
```

### Run the model

```bash
ollama run octopus-v4-Q4_K_M "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
```
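
Ollama also exposes a local REST API; this is a minimal sketch, assuming the default server at `localhost:11434` and the model name created above:

```bash
# Hypothetical REST sketch against Ollama's /api/generate endpoint.
curl http://localhost:11434/api/generate -d '{
  "model": "octopus-v4-Q4_K_M",
  "prompt": "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>",
  "stream": false
}'
```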

### Dataset and Benchmark