Llama Guard 3
Meta · July 2024
Active · Open weight · Decoder-only · Text
Parameters: 8B
Description
Meta's purpose-built safety classifier, based on Llama 3.1 and designed to detect harmful content in both user prompts and model responses. It classifies text across safety categories such as violence, hate speech, and sexual content, so developers can build guardrails into their AI applications.
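To make the guardrail idea concrete, here is a minimal sketch of how an application might interpret the classifier's text output. It assumes the model replies with `safe`, or with `unsafe` followed by a line of comma-separated category codes; that format, the example codes, and the names `Verdict` and `parse_verdict` are illustrative assumptions and not part of this card. Check the official model card before relying on them.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    """Parsed classifier result (hypothetical helper type)."""
    safe: bool
    categories: list[str] = field(default_factory=list)


def parse_verdict(raw: str) -> Verdict:
    """Parse raw classifier text into a Verdict.

    Assumed format: first non-empty line is 'safe' or 'unsafe';
    an optional second line lists comma-separated category codes.
    """
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty classifier output")

    label = lines[0].lower()
    if label == "safe":
        return Verdict(safe=True)
    if label == "unsafe":
        codes: list[str] = []
        if len(lines) > 1:
            # Drop empty entries left by stray commas or whitespace.
            codes = [c.strip() for c in lines[1].split(",") if c.strip()]
        return Verdict(safe=False, categories=codes)

    # Fail loudly on anything unexpected instead of guessing.
    raise ValueError(f"unrecognized label: {lines[0]!r}")
```

A caller would typically block or rewrite a message when `parse_verdict(output).safe` is false, and log the returned category codes for review. Raising on unrecognized output is a deliberate choice here: treating malformed text as safe would silently disable the guardrail.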
Key Innovations
safety-classifier