Update README.md
README.md
@@ -46,14 +46,14 @@ Squid employs a decoder-decoder framework with two main components:
 download this repository and run the following commands:
 ```bash
 git lfs install
-git clone https://huggingface.co/NexaAIDev/Dolphin
+git clone https://huggingface.co/NexaAIDev/Squid
 python inference_example.py
 ```
 
 ### Method 2
-Install `nexaai-dolphin` package
+Install `nexaai-squid` package
 ```
-pip install nexaai-dolphin
+pip install nexaai-squid
 ```
 
 Then run the following commands:
@@ -61,8 +61,8 @@ Then run the following commands:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
 import torch
-from dolphin.configuration_dolphin import DolphinConfig
-from dolphin.modeling_dolphin import DolphinForCausalLM
+from squid.configuration_squid import SquidConfig
+from squid.modeling_squid import SquidForCausalLM
 
 
 def inference_instruct(mycontext, question, device="cuda:0"):
@@ -106,8 +106,8 @@ def inference_instruct(mycontext, question, device="cuda:0"):
 
 if __name__ == "__main__":
     device_name = "cuda:0" if torch.cuda.is_available() else "cpu"
-    AutoConfig.register("dolphin", DolphinConfig)
-    AutoModelForCausalLM.register(DolphinConfig, DolphinForCausalLM)
+    AutoConfig.register("squid", SquidConfig)
+    AutoModelForCausalLM.register(SquidConfig, SquidForCausalLM)
     tokenizer = AutoTokenizer.from_pretrained('NexaAIDev/Squid')
     model = AutoModelForCausalLM.from_pretrained('NexaAIDev/Squid', trust_remote_code=True, torch_dtype=torch.bfloat16, device_map=device_name)
 
@@ -119,7 +119,7 @@ if __name__ == "__main__":
 ```
 
 ## Training Process
-Dolphin's training involves three stages:
+Squid's training involves three stages:
 1. Restoration Training: Reconstructing original context from compressed embeddings
 2. Continual Training: Generating context continuations from partial compressed contexts
 3. Instruction Fine-tuning: Generating responses to queries given compressed contexts
@@ -127,10 +127,10 @@ Dolphin's training involves three stages:
 This multi-stage approach progressively enhances the model's ability to handle long contexts and generate appropriate responses.
 
 ## Citation
-If you use Dolphin in your research, please cite our paper:
+If you use Squid in your research, please cite our paper:
 
 ```bibtex
-@article{chen2024dolphinlongcontextnew,
+@article{chen2024squidlongcontextnew,
 title={Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models},
 author={Wei Chen and Zhiyuan Li and Shuo Xin and Yihao Wang},
 year={2024},
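For reference, the snippet that Method 2 produces after this commit assembles as follows. This is a minimal sketch built only from the lines visible in the diff: it assumes the `nexaai-squid` package provides the `squid` module imported below, and the body of `inference_instruct` (README lines 69-105, not part of this diff) is left out.

```python
# Method 2 after the rename, assembled from the diff hunks above.
# Assumes `pip install nexaai-squid` has been run; the body of
# inference_instruct is elided in this commit's diff, so it is not reproduced.
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
import torch

from squid.configuration_squid import SquidConfig
from squid.modeling_squid import SquidForCausalLM

if __name__ == "__main__":
    device_name = "cuda:0" if torch.cuda.is_available() else "cpu"
    # Map the "squid" model_type to the custom classes so the Auto* factories
    # can resolve the checkpoint's architecture by name.
    AutoConfig.register("squid", SquidConfig)
    AutoModelForCausalLM.register(SquidConfig, SquidForCausalLM)
    tokenizer = AutoTokenizer.from_pretrained('NexaAIDev/Squid')
    model = AutoModelForCausalLM.from_pretrained(
        'NexaAIDev/Squid',
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
        device_map=device_name,
    )
    # The README then calls inference_instruct(mycontext, question, device=...)
    # with a long context and a question over it (definition elided above).
```

Note that the rename has to touch three places in step: the import path (`squid.configuration_squid`), the `model_type` string passed to `AutoConfig.register`, and the class names themselves; missing any one of them would leave `AutoModelForCausalLM.from_pretrained` unable to resolve the `squid` architecture locally and force it onto the remote code path enabled by `trust_remote_code=True`.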