Update README.md
---
license: other
language:
- ja
base_model:
- tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3
- Qwen/Qwen2.5-VL-7B-Instruct
pipeline_tag: visual-question-answering
---

# Llama-3.1-70B-Instruct-multimodal-JP-Graph - Built with Llama

Llama-3.1-70B-Instruct-multimodal-JP-Graph is a Japanese large vision-language model.
This model is based on [Llama-3.1-Swallow-70B](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3) and the image encoder of [Qwen2-VL-7B](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct).

# How to use
### 1. Install LLaVA-NeXT

- First, install LLaVA-NeXT by following the instructions in the [LLaVA-NeXT repository](https://github.com/LLaVA-VL/LLaVA-NeXT):

```sh
git clone https://github.com/LLaVA-VL/LLaVA-NeXT
cd LLaVA-NeXT
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip  # Enable PEP 660 support.
pip install -e ".[train]"
```

### 2. Install dependencies
```sh
pip install flash-attn==2.6.3
pip install transformers==4.45.2
```
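
As a quick sanity check that the pinned versions are active in the `llava` environment (this check is an addition to the steps above, not part of them):

```python
# Confirm the pinned dependency versions resolved correctly.
import flash_attn
import transformers

print(transformers.__version__)  # expect 4.45.2
print(flash_attn.__version__)    # expect 2.6.3
```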

### 3. Modify LLaVA-NeXT
Modify the LLaVA-NeXT code as follows:
- Create the LLaVA-NeXT/llava/model/multimodal_encoder/qwen2_vl directory and copy the contents of the attached qwen2_vl directory into it.
- Overwrite LLaVA-NeXT/llava/model/multimodal_encoder/builder.py with the attached "builder.py" (a rough sketch of this dispatch follows the list).
- Copy the attached "qwen2vl_encoder.py" into LLaVA-NeXT/llava/model/multimodal_encoder/.
- Overwrite LLaVA-NeXT/llava/model/language_model/llava_llama.py with the attached "llava_llama.py".
- Overwrite LLaVA-NeXT/llava/model/llava_arch.py with the attached "llava_arch.py".
- Overwrite LLaVA-NeXT/llava/conversation.py with the attached "conversation.py".

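For orientation only, here is a rough sketch of the kind of dispatch the replacement builder.py performs. `build_vision_tower` is the existing entry point in LLaVA-NeXT's multimodal_encoder/builder.py; the class name `Qwen2VLVisionTower` and the name-matching rule are assumptions, and the attached builder.py is authoritative:

```python
# Rough sketch only; the attached builder.py is authoritative.
# Assumes qwen2vl_encoder.py defines a Qwen2VLVisionTower class (assumed name).
from .clip_encoder import CLIPVisionTower
from .qwen2vl_encoder import Qwen2VLVisionTower  # assumed class name


def build_vision_tower(vision_tower_cfg, **kwargs):
    # LLaVA-NeXT resolves the encoder checkpoint name from the model config.
    vision_tower = getattr(
        vision_tower_cfg,
        "mm_vision_tower",
        getattr(vision_tower_cfg, "vision_tower", None),
    )
    # Route Qwen2-VL checkpoints to the new encoder; keep CLIP as the default.
    if vision_tower is not None and "qwen2" in vision_tower.lower():
        return Qwen2VLVisionTower(vision_tower, args=vision_tower_cfg, **kwargs)
    return CLIPVisionTower(vision_tower, args=vision_tower_cfg, **kwargs)
```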

### 4. Inference
The following script loads the model and runs inference.
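This is a minimal sketch following LLaVA-NeXT's standard inference flow. The checkpoint path, `model_name`, conversation-template key, and image path are assumptions rather than this repository's confirmed settings; adjust them to match the attached files:

```python
# Minimal inference sketch based on LLaVA-NeXT's standard flow.
# The checkpoint path, model_name, conversation template, and image path
# below are assumptions; adjust them to this repository's actual settings.
import copy

import torch
from PIL import Image

from llava.constants import DEFAULT_IMAGE_TOKEN, IMAGE_TOKEN_INDEX
from llava.conversation import conv_templates
from llava.mm_utils import process_images, tokenizer_image_token
from llava.model.builder import load_pretrained_model

pretrained = "path/to/Llama-3.1-70B-Instruct-multimodal-JP-Graph"  # assumed path
model_name = "llava_llama3"  # assumption; must match the modified llava_llama.py
device = "cuda"

tokenizer, model, image_processor, max_length = load_pretrained_model(
    pretrained, None, model_name, device_map="auto"
)
model.eval()

# Preprocess the input image with the model's image processor.
image = Image.open("example_graph.png")  # placeholder image path
image_tensor = process_images([image], image_processor, model.config)
image_tensor = [t.to(dtype=torch.float16, device=device) for t in image_tensor]

# Build the prompt; the template key "llava_llama_3" is an assumption and may
# be superseded by the template defined in the attached conversation.py.
conv = copy.deepcopy(conv_templates["llava_llama_3"])
question = DEFAULT_IMAGE_TOKEN + "\nPlease describe this graph."
conv.append_message(conv.roles[0], question)
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()

input_ids = (
    tokenizer_image_token(prompt, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt")
    .unsqueeze(0)
    .to(device)
)

with torch.inference_mode():
    output_ids = model.generate(
        input_ids,
        images=image_tensor,
        image_sizes=[image.size],
        do_sample=False,
        max_new_tokens=256,
    )

print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
```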