Update README.md
README.md
@@ -46,14 +46,14 @@ Squid employs a decoder-decoder framework with two main components:
 download this repository and run the following commands:
 ```bash
 git lfs install
-git clone https://huggingface.co/NexaAIDev/Dolphin
+git clone https://huggingface.co/NexaAIDev/Squid
 python inference_example.py
 ```
 
 ### Method 2
-Install `nexaai-dolphin` package
+Install `nexaai-squid` package
 ```
-pip install nexaai-dolphin
+pip install nexaai-squid
 ```
 
 Then run the following commands:
@@ -61,8 +61,8 @@ Then run the following commands:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
 import torch
-from dolphin.configuration_dolphin import DolphinConfig
-from dolphin.modeling_dolphin import DolphinForCausalLM
+from squid.configuration_squid import SquidConfig
+from squid.modeling_squid import SquidForCausalLM
 
 
 def inference_instruct(mycontext, question, device="cuda:0"):
@@ -106,8 +106,8 @@ def inference_instruct(mycontext, question, device="cuda:0"):
 
 if __name__ == "__main__":
     device_name = "cuda:0" if torch.cuda.is_available() else "cpu"
-    AutoConfig.register("dolphin", DolphinConfig)
-    AutoModelForCausalLM.register(DolphinConfig, DolphinForCausalLM)
+    AutoConfig.register("squid", SquidConfig)
+    AutoModelForCausalLM.register(SquidConfig, SquidForCausalLM)
     tokenizer = AutoTokenizer.from_pretrained('NexaAIDev/Squid')
     model = AutoModelForCausalLM.from_pretrained('NexaAIDev/Squid', trust_remote_code=True, torch_dtype=torch.bfloat16, device_map=device_name)
 
@@ -119,7 +119,7 @@ if __name__ == "__main__":
 ```
 
 ## Training Process
-Dolphin's training involves three stages:
+Squid's training involves three stages:
 1. Restoration Training: Reconstructing original context from compressed embeddings
 2. Continual Training: Generating context continuations from partial compressed contexts
 3. Instruction Fine-tuning: Generating responses to queries given compressed contexts
@@ -127,10 +127,10 @@ Dolphin's training involves three stages:
 This multi-stage approach progressively enhances the model's ability to handle long contexts and generate appropriate responses.
 
 ## Citation
-If you use Dolphin in your research, please cite our paper:
+If you use Squid in your research, please cite our paper:
 
 ```bibtex
-@article{chen2024dolphinlongcontextnew,
+@article{chen2024squidlongcontextnew,
 title={Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models},
 author={Wei Chen and Zhiyuan Li and Shuo Xin and Yihao Wang},
 year={2024},
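For reference, the snippet that Method 2 produces after this commit assembles as follows. This is a minimal sketch built only from the lines visible in the diff: it assumes the `nexaai-squid` package provides the `squid` module imported below, and the body of `inference_instruct` (README lines 69-105, not part of this diff) is left out.

```python
# Method 2 after the rename, assembled from the diff hunks above.
# Assumes `pip install nexaai-squid` has been run; the body of
# inference_instruct is elided in this commit's diff, so it is not reproduced.
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
import torch

from squid.configuration_squid import SquidConfig
from squid.modeling_squid import SquidForCausalLM

if __name__ == "__main__":
    device_name = "cuda:0" if torch.cuda.is_available() else "cpu"
    # Map the "squid" model_type to the custom classes so the Auto* factories
    # can resolve the checkpoint's architecture by name.
    AutoConfig.register("squid", SquidConfig)
    AutoModelForCausalLM.register(SquidConfig, SquidForCausalLM)
    tokenizer = AutoTokenizer.from_pretrained('NexaAIDev/Squid')
    model = AutoModelForCausalLM.from_pretrained(
        'NexaAIDev/Squid',
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
        device_map=device_name,
    )
    # The README then calls inference_instruct(mycontext, question, device=...)
    # with a long context and a question over it (definition elided above).
```

Note that the rename has to touch three places in step: the import path (`squid.configuration_squid`), the `model_type` string passed to `AutoConfig.register`, and the class names themselves; missing any one of them would leave `AutoModelForCausalLM.from_pretrained` unable to resolve the `squid` architecture locally and force it onto the remote code path enabled by `trust_remote_code=True`.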