## Inference

With the following code, you can perform batched offline inference with the quantized model:

```python
from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
from lmdeploy.vl import load_image

model = 'OpenGVLab/InternVL2-2B-AWQ'

# InternVL2's system prompt (meta instruction), kept in the original Chinese
system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')

# use the InternLM2-style chat template and override its system prompt
chat_template_config = ChatTemplateConfig('internvl-internlm2')
chat_template_config.meta_instruction = system_prompt

# load the AWQ-quantized weights with the TurboMind backend
backend_config = TurbomindEngineConfig(model_format='awq')
pipe = pipeline(model, chat_template_config=chat_template_config,
                backend_config=backend_config)
response = pipe(('describe this image', image))
print(response.text)
```
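
The call above issues a single request. To actually batch, the same pipeline can be given a list of (prompt, image) tuples, one per request; the sketch below assumes that list form of the call and reuses the tiger image as a stand-in for a second input:

```python
from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
from lmdeploy.vl import load_image

model = 'OpenGVLab/InternVL2-2B-AWQ'
chat_template_config = ChatTemplateConfig('internvl-internlm2')
backend_config = TurbomindEngineConfig(model_format='awq')
pipe = pipeline(model, chat_template_config=chat_template_config,
                backend_config=backend_config)

# each tuple in the list is an independent request; the same test image
# is reused here only as a placeholder for a second input
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
prompts = [('describe this image', image),
           ('what animal is shown in this image?', image)]

# a list in, a list of responses out, one per prompt
responses = pipe(prompts)
for r in responses:
    print(r.text)
```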

For more information about the pipeline parameters, please refer to the [pipeline documentation](https://github.com/InternLM/lmdeploy/blob/main/docs/en/inference/pipeline.md).

## Service

To deploy InternVL2 as an API, configure the chat template first by creating the following JSON file, `chat_template.json`:

```json
{
    "model_name": "internvl-internlm2",
    "meta_instruction": "我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。",
    "stop_words": ["<|im_start|>", "<|im_end|>"]
}
```

LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below is an example of service startup:

```shell
# pass the chat template created above via --chat-template
lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --chat-template chat_template.json
```

To use the OpenAI-style interface, you need to install the `openai` package:

```shell
pip install openai
```

Then, use the code below to make the API call:
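
The sketch below uses the `openai` client and assumes the server started above is reachable at its default address `http://0.0.0.0:23333` and was launched without API-key authentication, so any placeholder key is accepted:

```python
from openai import OpenAI

# the api_server started above listens on port 23333 by default; adjust if needed
client = OpenAI(api_key='YOUR_API_KEY', base_url='http://0.0.0.0:23333/v1')

# ask the server which model name it serves
model_name = client.models.list().data[0].id

# OpenAI-style vision request: text plus an image URL in one user message
response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url',
             'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}},
        ],
    }],
    temperature=0.8,
    top_p=0.8)
print(response.choices[0].message.content)
```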