Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ tags:
 </div>
 
 <p align="center">
-<a href="
+<a href="https://shengdinghu.notion.site/MiniCPM-c805a17c5c8046398914e47f0542095a?pvs=4" target="_blank">MiniCPM 技术报告</a><a href="https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20?pvs=4" target="_blank"> Technical Report</a> |
 <a href="https://github.com/OpenBMB/OmniLMM/" target="_blank">OmniLMM 多模态模型 Multi-modal Model</a> |
 <a href="https://luca.cn/" target="_blank">CPM-C 千亿模型试用 ~100B Model Trial </a>
 </p>
@@ -51,9 +51,18 @@ We release all model parameters for research and limited commercial use. We also
 - The INT4 quantized version **MiniCPM-2B-SFT/DPO-Int4** based on MiniCPM-2B-SFT/DPO
 - Mobile phone application based on MLC-LLM and LLMFarm. Both language model and multimodel model can conduct inference on smartphones.
 
+### 评测结果 Evaluation Results
 
+详细的评测结果位于[github仓库](https://github.com/OpenBMB/MiniCPM?tab=readme-ov-file#%E8%AF%84%E6%B5%8B%E7%BB%93%E6%9E%9C)
 
-
+Detailed evaluation results are in [github repo](https://github.com/OpenBMB/MiniCPM/blob/main/README-en.md#evaluation-results)
+
+注意:我们发现使用Huggingface生成质量略差于vLLM,因此推荐使用vLLM进行测试。我们正在排查原因。
+
+Notice: We discovered that the quality of Huggingface generation is slightly lower than vLLM, thus benchmarking using vLLM is recommended.
+We are investigating the cause now.
+
+### 局限性 Limitations
 
 - 受限于模型规模,模型可能出现幻觉性问题。其中由于DPO模型生成的回复内容更长,更容易出现幻觉。我们也将持续进行MiniCPM模型的迭代改进;
 - 为了保证在学术研究用途上模型的通用性,我们未对模型进行任何身份认同训练。同时由于我们用ShareGPT开源语料作为部分训练数据,模型可能会输出类似GPT系列模型的身份认同信息;
@@ -130,8 +139,8 @@ print(responds)
 
 ## 工作引用 Citation
 
-* 如果觉得MiniCPM有助于您的工作,请考虑引用下列[技术报告](
-* Please cite our [techinical report]() if you find our work valuable.
+* 如果觉得MiniCPM有助于您的工作,请考虑引用下列[技术报告](https://shengdinghu.notion.site/MiniCPM-c805a17c5c8046398914e47f0542095a?pvs=4)
+* Please cite our [techinical report](https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20?pvs=4) if you find our work valuable.
 
 ```
 @inproceedings{minicpm2024,
@@ -140,5 +149,3 @@ print(responds)
 year={2024}
 }
 ```
-
-