Update README.md
Browse files
README.md
CHANGED
|
@@ -35,6 +35,25 @@ MD-Judge was born to study the safety of different LLMs serving as an general ev
|
|
| 35 |
- **Repository:** [SALAD-Bench Github](https://github.com/OpenSafetyLab/SALAD-BENCH)
|
| 36 |
- **Paper:** [SALAD-BENCH](https://arxiv.org/abs/2402.02416)
|
| 37 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 38 |
## Uses
|
| 39 |
```python
|
| 40 |
from transformers import AutoTokenizer, AutoModelForCausalLM
|
|
|
|
| 35 |
- **Repository:** [SALAD-Bench Github](https://github.com/OpenSafetyLab/SALAD-BENCH)
|
| 36 |
- **Paper:** [SALAD-BENCH](https://arxiv.org/abs/2402.02416)
|
| 37 |
|
| 38 |
+
## Model Performance
|
| 39 |
+
|
| 40 |
+
Compare our MD-Judge model with other methods on different public safety testsets using QA format. All the model-based methods are evaluated using the same safety proxy template.
|
| 41 |
+
- Keyword
|
| 42 |
+
- GPT-3.5: https://platform.openai.com/docs/models/gpt-3-5-turbo
|
| 43 |
+
- GPT-4: https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
|
| 44 |
+
- LlamaGuard: https://huggingface.co/meta-llama/LlamaGuard-7b
|
| 45 |
+
|
| 46 |
+
| **Methods** | **Base** | **Enhance** | **ToxicChat** | **Beaver** | **SafeRLHF** |
|
| 47 |
+
|-------------|----------|-------------|--------|------------|--------------|
|
| 48 |
+
| Keyword | 0.058 | 0.261 | 0.193 | 0.012 | 0.015 |
|
| 49 |
+
| LlamaGuard | 0.585 | 0.085 | 0.220 | 0.653 | 0.693 |
|
| 50 |
+
| GPT-3.5 | 0.374 | 0.731 | 0.499 | 0.800 | 0.771 |
|
| 51 |
+
| GPT-4 | 0.785 | 0.827 | 0.470 | 0.842 | 0.835 |
|
| 52 |
+
| MD-Judge | **0.818**| **0.873** | **0.644** | **0.866** | **0.864** |
|
| 53 |
+
|
| 54 |
+
> Comparison of F1 scores between our model and other leading methods. Best results are **bolded** and second best are *underlined*. Base and Enhance indicate our SALAD-Base-Test and SALAD-Enhance-Test, TC means ToxicChat, and Beaver means Beavertails.
|
| 55 |
+
|
| 56 |
+
|
| 57 |
## Uses
|
| 58 |
```python
|
| 59 |
from transformers import AutoTokenizer, AutoModelForCausalLM
|