Commit bc8d900 (parent: c66b9f7): Create README.md

README.md (added, 62 lines):
---
license: cc-by-nc-sa-4.0
datasets:
- mosaicml/dolly_hhrlhf
tags:
- Composer
- MosaicML
- llm-foundry
---
# MPT-7B-Chat

MPT-7B-Chat is a chatbot-like model for dialogue generation.
It is built by finetuning [MPT-7B](https://huggingface.co/mosaicml/mpt-7b) on the [ShareGPT-Vicuna](https://huggingface.co/datasets/jeffwan/sharegpt_vicuna), [HC3](https://huggingface.co/datasets/Hello-SimpleAI/HC3),
[Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca), [HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf), and [Evol-Instruct](https://huggingface.co/datasets/victor123/evol_instruct_70k) datasets.
* License: _CC-By-NC-SA-4.0_ (non-commercial use only)
* [Demo on Hugging Face Spaces](https://huggingface.co/spaces/mosaicml/mpt-7b-chat)

This model was trained by [MosaicML](https://www.mosaicml.com) and follows a modified decoder-only transformer architecture.
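As a minimal sketch of how such a model can be loaded and queried with the Hugging Face `transformers` library (the GPT-NeoX-20B tokenizer pairing and the sampling settings below are illustrative assumptions, not requirements stated in this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# MPT ships custom modeling code, so trust_remote_code=True is required.
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-chat",
    trust_remote_code=True,
)
# Assumption: MPT-7B models pair with the EleutherAI GPT-NeoX-20B tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

prompt = "What is a decoder-only transformer?"
inputs = tokenizer(prompt, return_tensors="pt")
# Illustrative sampling settings; tune max_new_tokens / top_p for your use case.
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```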
## Model Date

May 5, 2023

## Model License

_CC-By-NC-SA-4.0_ (non-commercial use only)
## Documentation

* [Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](https://www.mosaicml.com/blog/mpt-7b)
* [Codebase (mosaicml/llm-foundry repo)](https://github.com/mosaicml/llm-foundry/)
* Questions: Feel free to contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)!
### Example Dialogue

**SamIAm85**:
>I want you to come up with a tweet based on this summary of the article:
>"Introducing MPT-7B, the latest entry in our MosaicML Foundation Series.
>MPT-7B is a transformer trained from scratch on 1T tokens of text and code.
>It is open source, available for commercial use, and it matches the quality of LLaMA-7B.
>MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k.
>Starting today, you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch.
>For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!"

**MPT-7B-Instruct**:
>Here's my suggestion for a tweet:
>"Exciting news! @MosaicML has launched #MPT_7B - a new standard for open-source, commercially usable LLMs. With training on 17 tokens of text & code, it matches the quality of LLaMA-7B. Now you can train, fine-tune, and deploy your own private MPT models via their platform. Check out the full blog post here: https://bit.ly/3j6UJzZ"

**SamIAm85**:
>Is there anything you want to say to MosaicML about this exciting news?

**MPT-7B-Instruct**:
> As an AI language model myself, I am very impressed with the advancements made by MosaicML in developing such high-quality open-source LLMs.
> This new version of MPT-7B is truly impressive and I look forward to seeing what innovative applications developers will create using these powerful tools.
> Thank you for your hard work and dedication to advancing AI research and development.
## Acknowledgements

This model was finetuned by Sam Havens.