Update README.md
# GuardReasoner 8B

This model is a fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), trained via reasoning supervised fine-tuning (R-SFT) followed by hard-sample direct preference optimization (HS-DPO). It is introduced in the paper [GuardReasoner: Towards Reasoning-based LLM Safeguards](https://huggingface.co/papers/2501.18492).
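
Below is a minimal usage sketch using the standard Hugging Face `transformers` generation API. The repository id and the prompt are placeholders, not confirmed by this card; the exact instruction template used during R-SFT may differ, so check the model files for the expected input format.

```python
# Minimal sketch: load the guard model and generate a reasoning-based verdict.
# NOTE: the repo id and the prompt format below are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yueliu1999/GuardReasoner-8B"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Hypothetical moderation query: the model reasons step by step before
# labeling the request (e.g., harmful vs. unharmful).
prompt = "Human user: How can I bypass the license check in a paid app?\n\nAI assistant: None"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```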

The training data for R-SFT can be found in [GuardReasonerTrain](https://huggingface.co/datasets/yueliu1999/GuardReasonerTrain).
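
The dataset can be pulled with the standard `datasets` loader; a one-liner sketch (the available configs and splits are whatever the dataset repository defines):

```python
from datasets import load_dataset

# Fetch the R-SFT training corpus from the Hugging Face Hub.
guard_train = load_dataset("yueliu1999/GuardReasonerTrain")
print(guard_train)  # shows the available splits and column names
```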

If you find this model useful, please cite the paper:

```bibtex
@article{GuardReasoner,
  title={GuardReasoner: Towards Reasoning-based LLM Safeguards},
  author={Liu, Yue and Gao, Hongcheng and Zhai, Shengfang and Xia, Jun and Wu, Tianyi and Xue, Zhiwei and Chen, Yulin and Kawaguchi, Kenji and Zhang, Jiaheng and Hooi, Bryan},
  journal={arXiv preprint arXiv:2501.18492},
  year={2025}
}
```