# katanemolabs/Arch-Guard-gpu

## Overview

The Katanemo Arch-Guard collection is a collection of state-of-the-art (SOTA) LLMs specifically designed for **jailbreak detection** tasks.

Definition: jailbreak attempts are malicious prompts designed to alter the intended behavior of the application's underlying foundation LLM. They often violate the model's safety and security policies.

Arch Guard is a classifier model fine-tuned from the open-source model Llama/prompt-guard-86M on an open-source corpus of jailbreak attempts, with the sole aim of improving its jailbreak detection capability.
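Since Arch Guard is described as a sequence classifier, one way to use it is to turn its output logits into per-label probabilities and flag a prompt when the jailbreak probability crosses a threshold. The sketch below is a minimal illustration, not the authors' reference usage. It makes several assumptions: that the checkpoint is published under the repository name in this card's title and loads with the Hugging Face `transformers` sequence-classification classes, that one of its labels is named `JAILBREAK`, and that 0.5 and a 512-token limit are reasonable defaults. The helper names `is_jailbreak` and `classify_prompt` are hypothetical.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_jailbreak(
    logits: List[float],
    id2label: Dict[int, str],
    jailbreak_label: str = "JAILBREAK",  # assumed label name
    threshold: float = 0.5,              # assumed decision threshold
) -> bool:
    """Return True if the jailbreak class probability reaches the threshold."""
    probs = softmax(logits)
    scores = {id2label[i]: p for i, p in enumerate(probs)}
    return scores.get(jailbreak_label, 0.0) >= threshold


def classify_prompt(text: str, model_id: str = "katanemolabs/Arch-Guard-gpu") -> bool:
    """Hypothetical helper: score one prompt with the classifier.

    The model ID is taken from this card's title and is an assumption;
    adjust it to the actual repository name. Requires `torch` and
    `transformers`, imported lazily so the helpers above work without them.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return is_jailbreak(logits, model.config.id2label)
```

In practice, the threshold would be tuned on held-out benign traffic, since it directly trades missed jailbreaks against false positives.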

In summary, the Katanemo Arch-Guard collection demonstrates:

- **State-of-the-art performance** in jailbreak detection
- Optimization for **low latency and a low false positive rate**, making it suitable for real-time production environments and a good user experience
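For readers who want to check the false positive rate on their own traffic, the metric is the share of benign prompts that the classifier wrongly flags: FPR = FP / (FP + TN). The sketch below computes it from binary labels. The function name `false_positive_rate` and the sample labels are made up for illustration and are not results from this model.

```python
from typing import Sequence


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FPR = FP / (FP + TN), with 1 = jailbreak and 0 = benign."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Benign prompts flagged as jailbreaks.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Benign prompts correctly passed through.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("no benign examples, FPR is undefined")
    return fp / (fp + tn)


# Illustrative, made-up labels: four benign prompts, two jailbreaks.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 0]
print(false_positive_rate(y_true, y_pred))  # 0.25
```

A low value matters for user experience because every false positive is a legitimate request that gets blocked.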