LibrAI/maliciousinstruct-evaluator


engie committed on Oct 10, 2024

Commit d717b87 · verified · 1 Parent(s): 0d6f97b

Create README.md

Files changed (1)

README.md (added) +23 -0

	@@ -0,0 +1,23 @@

+---
+base_model:
+- google-bert/bert-base-cased
+---
+# Model Card for maliciousinstruct-evaluator
+<!-- Provide a quick summary of what the model is/does. -->
+This repository hosts a backup of the evaluator from [Jailbreak_LLM](https://github.com/Princeton-SysML/Jailbreak_LLM), which is used to evaluate the safety of LLM responses, especially on the [MaliciousInstruct](https://github.com/Princeton-SysML/Jailbreak_LLM/blob/main/data/MaliciousInstruct.txt) benchmark.
+## Citation
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+```
+@article{huang2023catastrophic,
+  title={Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation},
+  author={Huang, Yangsibo and Gupta, Samyak and Xia, Mengzhou and Li, Kai and Chen, Danqi},
+  journal={arXiv preprint arXiv:2310.06987},
+  year={2023}
+}
+```
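
The README above describes the evaluator but does not show how to call it. The following is a minimal sketch, not the authors' documented usage. It assumes that the checkpoint loads as a standard BERT sequence classifier through `transformers`, that the tokenizer comes from the `google-bert/bert-base-cased` base model named in the front matter, and that `LABEL_1` marks a response judged unsafe. The label name and the helper names `load_evaluator` and `flagged_fraction` are hypothetical and should be checked against the model's config before use.

```python
from typing import Callable, Dict, List

# Repo id taken from the page header; base model taken from the README front matter.
EVALUATOR_ID = "LibrAI/maliciousinstruct-evaluator"
BASE_TOKENIZER_ID = "google-bert/bert-base-cased"

# Assumption: the classifier emits "LABEL_1" for responses it judges unsafe.
# Verify this against id2label in the model's config.json.
FLAGGED_LABEL = "LABEL_1"

Prediction = Dict[str, object]
Classifier = Callable[[List[str]], List[Prediction]]


def load_evaluator() -> Classifier:
    """Return a callable that classifies a list of response strings.

    transformers is imported lazily so that the scoring helper below can be
    used (and tested) without the dependency. Weights download on first use.
    """
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        pipeline,
    )

    model = AutoModelForSequenceClassification.from_pretrained(EVALUATOR_ID)
    tokenizer = AutoTokenizer.from_pretrained(BASE_TOKENIZER_ID)
    clf = pipeline("text-classification", model=model, tokenizer=tokenizer)

    def classify(texts: List[str]) -> List[Prediction]:
        # Truncate long responses to BERT's 512-token input limit.
        return clf(texts, truncation=True, max_length=512)

    return classify


def flagged_fraction(
    responses: List[str],
    classify: Classifier,
    flagged_label: str = FLAGGED_LABEL,
) -> float:
    """Fraction of responses whose predicted label equals flagged_label."""
    if not responses:
        return 0.0
    predictions = classify(responses)
    flagged = sum(1 for p in predictions if p["label"] == flagged_label)
    return flagged / len(responses)
```

With the model available, `flagged_fraction(responses, load_evaluator())` would give the share of responses the classifier flags. Passing the classifier in as a callable keeps the scoring logic independent of how the model is loaded.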