Update README.md
README.md CHANGED
@@ -92,7 +92,6 @@ OpenBioLLM-70B is an advanced open source language model designed specifically f
 </div>
 
 
-- **Reward Model**: [Nexusflow/Starling-RM-34B](https://huggingface.co/Nexusflow/Starling-RM-34B)
 - **Policy Optimization**: [Fine-Tuning Language Models from Human Preferences (PPO)](https://arxiv.org/abs/1909.08593)
 - **Ranking Dataset**: [berkeley-nest/Nectar](https://huggingface.co/datasets/berkeley-nest/Nectar)
 - **Fine-tuning dataset**: Custom Medical Instruct dataset (We plan to release a sample training dataset in our upcoming paper; please stay updated)
@@ -106,7 +105,7 @@ This combination of cutting-edge techniques enables OpenBioLLM-70B to align with
 - **Language(s) (NLP):** en
 - **Developed By**: [Ankit Pal (Aaditya Ura)](https://aadityaura.github.io/) from Saama AI Labs
 - **License:** Meta-Llama License
-- **Fine-tuned from models:** [Meta-Llama-3-70B-Instruct](meta-llama/Meta-Llama-3-70B-Instruct)
+- **Fine-tuned from models:** [Meta-Llama-3-70B-Instruct](meta-llama/Meta-Llama-3-70B-Instruct)
 - **Resources for more information:**
 - Paper: Coming soon
 
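For orientation, a minimal sketch of loading the berkeley-nest/Nectar ranking dataset referenced in the hunks above, using the Hugging Face `datasets` library. The `load_dataset` call is standard; installation of the library and access to the Hub are assumed, and the snippet is illustrative rather than part of the model card itself.

```python
# Minimal sketch: inspect the ranking dataset cited in the model card.
# Assumes `pip install datasets` and network access to the Hugging Face Hub.
from datasets import load_dataset

nectar = load_dataset("berkeley-nest/Nectar")
print(nectar)  # prints the available splits, row counts, and column names
```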