---
base_model: openbmb/MiniCPM3-4B
library_name: peft
license: apache-2.0
language:
- zh
- en
---
## MiniCPM3-RAG-LoRA
**MiniCPM3-RAG-LoRA**, developed jointly by ModelBest Inc. and the Natural Language Processing Lab at Tsinghua University (THUNLP), uses the Direct Preference Optimization (DPO) method to fine-tune [MiniCPM3](https://huggingface.co/openbmb/MiniCPM3-4B) with LoRA. Trained on just over 20,000 open-source examples from open-domain question answering and logical reasoning tasks, it achieves an average performance improvement of 13% on general benchmark datasets.

We also invite you to explore MiniCPM3 and the RAG toolkit series:

- Generation Model: [MiniCPM3](https://huggingface.co/openbmb/MiniCPM3-4B)
- Retrieval Model: [RankCPM-E](https://huggingface.co/openbmb/RankCPM-E)
- Re-ranking Model: [RankCPM-R](https://huggingface.co/openbmb/RankCPM-R)
- LoRA Plugin for RAG Scenarios: [MiniCPM3-RAG-LoRA](https://huggingface.co/openbmb/MiniCPM3-RAG-LoRA)
## Model Information

- Model Size: 4B
## Usage

### Input Format

MiniCPM3-RAG-LoRA expects input in the following format:
```
Background: {{ passages }} Query: {{ query }}
```
For example:
```
Background:
["In the novel 'The Silent Watcher,' the lead character is named Alex Carter. Alex is a private detective who uncovers a series of mysterious events in a small town.",
"Set in a quiet town, 'The Silent Watcher' follows Alex Carter, a former police officer turned private investigator, as he unravels the town's dark secrets.",
"'The Silent Watcher' revolves around Alex Carter's journey as he confronts his past while solving complex cases in his hometown."]
Query:
"What is the name of the lead character in the novel 'The Silent Watcher'?"
```
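This input string can be assembled programmatically. Below is a minimal sketch that mirrors the string concatenation used in the demo script further down; the helper name `build_rag_prompt` is our own illustration, not part of the model's API.

```python
# Assemble the "Background: ... Query: ..." input string for MiniCPM3-RAG-LoRA.
# `build_rag_prompt` is an illustrative helper name, not part of the model's API.
def build_rag_prompt(passages: list[str], query: str) -> str:
    # The passage list is serialized with str(), matching the demo script's formatting.
    return 'Background:\n' + str(passages) + '\n\n' + 'Query:\n' + str(query) + '\n\n'

passages = ["'The Silent Watcher' follows private investigator Alex Carter."]
query = "What is the name of the lead character in the novel 'The Silent Watcher'?"
print(build_rag_prompt(passages, query))
```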
### Requirements
```
transformers>=4.36.0
```
### Demo
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

torch.manual_seed(0)

# Load the tokenizer and model; MiniCPM3's custom model class requires trust_remote_code.
path = 'openbmb/MiniCPM3-RAG-LoRA'
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)

# Retrieved passages and the user query.
passages = ["In the novel 'The Silent Watcher,' the lead character is named Alex Carter. Alex is a private detective who uncovers a series of mysterious events in a small town.",
            "Set in a quiet town, 'The Silent Watcher' follows Alex Carter, a former police officer turned private investigator, as he unravels the town's dark secrets.",
            "'The Silent Watcher' revolves around Alex Carter's journey as he confronts his past while solving complex cases in his hometown."]
query = "What is the name of the lead character in the novel 'The Silent Watcher'?"

# Assemble the "Background: ... Query: ..." input expected by the model.
input_text = 'Background:\n' + str(passages) + '\n\n' + 'Query:\n' + str(query) + '\n\n'

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": input_text},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# `chat` is provided by the model's remote code; sample with the recommended settings.
outputs = model.chat(tokenizer, prompt, temperature=0.8, top_p=0.8)
print(outputs[0])  # The lead character in the novel 'The Silent Watcher' is named Alex Carter.
```
## Evaluation Results

After LoRA fine-tuning for RAG scenarios, MiniCPM3-RAG-LoRA outperforms strong industry models such as Llama3-8B and Baichuan2-13B across a range of tasks, including open-domain question answering (NQ, TQA, MARCO), multi-hop question answering (HotpotQA), dialogue (WoW), fact checking (FEVER), and slot filling (T-REx). Bold marks the best result in each column; italics the second best.
| Model | NQ (Acc) | TQA (Acc) | MARCO (ROUGE) | HotpotQA (Acc) | WoW (F1) | FEVER (Acc) | T-REx (Acc) |
| :---------------: | :-----: | :------: | :----------: | :-----------: | :-----: | :--------: | :--------: |
| Llama3-8B | _45.36_ | **83.15** | _20.81_ | _28.52_ | 10.96 | 78.08 | 26.62 |
| Baichuan2-13B | 43.36 | 77.76 | 14.28 | 27.59 | 13.34 | 31.37 | _27.46_ |
| MiniCPM3 | 43.21 | 80.77 | 16.06 | 26.00 | _14.60_ | **87.22** | 26.26 |
| MiniCPM3-RAG-LoRA | **48.36** | _82.40_ | **27.68** | **31.61** | **16.29** | _85.81_ | **40.76** |
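For reference, the WoW column reports token-level F1. Below is a minimal sketch of that metric, assuming plain whitespace tokenization (evaluation suites typically also lowercase and strip punctuation before comparing):

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Unigram F1 between a generated answer and a reference answer.

    Simplified sketch: tokens are whitespace-separated, with no normalization.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # Count tokens that appear in both strings (multiset intersection).
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("Alex Carter", "the lead character is Alex Carter"))  # ≈ 0.5
```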
## License

* The code in this repo is released under the [Apache-2.0](https://github.com/OpenBMB/MiniCPM/blob/main/LICENSE) license.
* The use of the RankCPM-R model weights must strictly follow the [MiniCPM Model License](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md).
* The RankCPM-R model weights are fully open for academic research. After filling out this [questionnaire](https://modelbest.feishu.cn/share/base/form/shrcnpV5ZT9EJ6xYjh3Kx0J6v8g) for registration, they are also available for free commercial use.
<!-- ### Benchmark Descriptions:
- **Natural Questions (NQ, Accuracy):**
  - **Description:** Natural Questions is an open-domain question answering dataset consisting of questions posed by real users on Google Search. Each question is paired with a long document as context and includes both a short answer and a long answer.
  - **Metric:** Accuracy measures whether the model correctly identifies the short answer relevant to the question.
- **TriviaQA (TQA, Accuracy):**
  - **Description:** TriviaQA is a question answering dataset covering a wide range of topics, with questions and answers collected from trivia websites and encyclopedias.
  - **Metric:** Accuracy measures whether the model answers these questions correctly.
- **MS MARCO (ROUGE):**
  - **Description:** MS MARCO is a large-scale open-domain question answering dataset built mainly from Bing search queries and their corresponding answers. It contains short answers and relevant passages and is widely used for information retrieval and generation tasks. Given its size, we sampled 3,000 examples for this evaluation.
  - **Metric:** ROUGE evaluates the overlap between the generated answer and the reference answer, measuring generation quality.
- **HotpotQA (Accuracy):**
  - **Description:** HotpotQA is a multi-hop question answering dataset that requires reasoning across multiple documents to answer complex questions. It tests not only answer generation but also the interpretability of the reasoning process.
  - **Metric:** Accuracy measures whether the model correctly answers questions requiring multi-hop reasoning.
- **Wizard of Wikipedia (WoW, F1 Score):**
  - **Description:** Wizard of Wikipedia is a dialogue dataset focused on knowledge-grounded conversation. The model must generate rich, on-topic responses, with each dialogue turn supported by corresponding knowledge-base entries.
  - **Metric:** F1 measures word-level overlap between the generated response and the reference answer, assessing accuracy and completeness.
- **FEVER (Accuracy):**
  - **Description:** FEVER is a fact-checking dataset containing a large number of claims. The model must judge whether each claim is true or false based on the given documents, testing its fact-verification ability.
  - **Metric:** Accuracy evaluates the model's performance in judging the veracity of claims.
- **T-REx (Accuracy):**
  - **Description:** T-REx is a knowledge-base slot-filling dataset containing entity-relation pairs extracted from Wikipedia. The model must fill in missing slot values from context, testing its understanding of knowledge-base relations.
  - **Metric:** Accuracy measures the model's performance in correctly filling missing slot values. -->