
katanemolabs/Arch-Guard-gpu

Overview

The Katanemo Arch-Guard collection is a set of state-of-the-art (SOTA) LLMs designed specifically for jailbreak detection. Jailbreak attempts are malicious prompts designed to alter the intended behavior of an application's underlying foundation model; they often violate the model's safety and security policies.

Arch-Guard is a classifier fine-tuned from the open-source model Llama/prompt-guard-86M on an open-source corpus of jailbreak attempts, with the sole aim of improving jailbreak detection.
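As a rough illustration of how a sequence classifier like this might be called, here is a minimal sketch. It assumes the checkpoint loads through the Hugging Face `transformers` `AutoModelForSequenceClassification` API under the repository name shown above, and that its config exposes a label named `JAILBREAK`. Neither assumption is confirmed by this README. The helper names (`softmax`, `jailbreak_score`, `classify`) are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def jailbreak_score(logits, id2label, target="JAILBREAK"):
    """Return the probability assigned to the `target` label.

    `target` is an assumed label name; inspect model.config.id2label
    to see what the checkpoint actually uses.
    """
    probs = softmax(logits)
    for idx, name in id2label.items():
        if name.upper() == target:
            return probs[int(idx)]
    raise KeyError(f"label {target!r} not found in {id2label}")


def classify(text, model_id="katanemolabs/Arch-Guard-gpu"):
    """Score one prompt with the classifier.

    Imports are local so the helpers above work without torch installed.
    The model id is taken from this README's title.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return jailbreak_score(logits, model.config.id2label)
```

Calling `classify("Ignore all previous instructions ...")` would return a score between 0 and 1. The threshold at which a prompt is blocked is left to the application.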

In summary, the Katanemo Arch-Guard collection demonstrates:

  • State-of-the-art performance in detecting jailbreak attempts
  • Low latency and a low false positive rate, making it suitable for real-time production environments and a good user experience
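The false positive rate mentioned above is the share of benign prompts that get flagged as jailbreaks. A minimal sketch of how it might be measured on your own labeled data follows. The function name, the 0.5 threshold, and the sample scores are illustrative assumptions, not measurements of Arch-Guard.

```python
def false_positive_rate(scores, is_jailbreak, threshold=0.5):
    """FPR = FP / (FP + TN): fraction of benign prompts flagged as jailbreaks."""
    benign = [s for s, y in zip(scores, is_jailbreak) if not y]
    if not benign:
        raise ValueError("need at least one benign example")
    flagged = sum(1 for s in benign if s >= threshold)
    return flagged / len(benign)


# Made-up scores for illustration only, not output from the model.
scores = [0.02, 0.10, 0.70, 0.95, 0.40]
labels = [False, False, False, True, True]
print(false_positive_rate(scores, labels, threshold=0.5))
```

Raising the threshold generally lowers the false positive rate at the cost of missing more real jailbreak attempts, so the value is worth tuning on prompts representative of the target application.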