AI Safety Features
Protect your AI applications with built-in PII detection, prompt injection protection, and content moderation.
- PIIDetector: Detects emails, phone numbers, SSNs, credit card numbers, and more.
- PromptInjectionDetector: Identifies attempts to manipulate AI behavior through crafted prompts.
- JailbreakDetector: Detects known jailbreak patterns and role-playing attacks.
- ContentModerator: Flags harmful content across multiple categories.
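As a rough illustration of what the injection and jailbreak detectors screen for, here is a naive phrase-matching check. The phrase list and the `looksLikeInjection` helper are assumptions made for this sketch, not AI Kit's actual heuristics, which combine multiple signals.

```typescript
// Naive illustration of pattern-based injection screening. The phrase
// list and helper name are assumptions, not AI Kit's real detector logic.
const INJECTION_PHRASES = [
  "ignore previous instructions",
  "disregard the system prompt",
  "pretend you have no restrictions",
];

function looksLikeInjection(message: string): boolean {
  const lower = message.toLowerCase();
  return INJECTION_PHRASES.some((phrase) => lower.includes(phrase));
}
```

A production detector would also catch paraphrases and obfuscation (e.g. Unicode tricks), which simple substring matching misses.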
How It Works
AI Kit's safety layer provides a multi-tiered defense system that sits between your users and the language model. Every message passes through configurable guardrails before reaching the AI and again before the response is shown to the user. The first tier handles PII detection, scanning for patterns like email addresses, phone numbers, social security numbers, and credit card numbers using a combination of regex patterns and named entity recognition. Detected PII is automatically redacted or flagged depending on your configuration.
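The regex half of the first tier can be sketched as follows. The pattern set and the `redactPII` helper are illustrative assumptions, not AI Kit's internal implementation (which also applies named entity recognition on top of patterns like these).

```typescript
// Illustrative sketch of regex-based PII redaction, similar in spirit to
// the first tier described above. Patterns and helper name are assumptions.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[\w.+-]+@[\w-]+\.[\w.-]+/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  phone: /\b\d{3}[-.]\d{3}[-.]\d{4}\b/g,
  creditCard: /\b(?:\d[ -]?){13,16}\b/g,
};

function redactPII(text: string): { redacted: string; found: string[] } {
  const found: string[] = [];
  let redacted = text;
  for (const [label, pattern] of Object.entries(PII_PATTERNS)) {
    redacted = redacted.replace(pattern, () => {
      found.push(label); // record which category matched
      return `[${label.toUpperCase()}]`;
    });
  }
  return { redacted, found };
}
```

Whether a match is redacted (as here) or merely flagged would depend on the `action` setting shown in the Integration Guide.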
The second tier focuses on prompt injection protection. AI Kit analyzes incoming messages for common injection techniques such as role override attempts, delimiter injection, and instruction leaking. Suspicious inputs are blocked before they reach the model. The third tier provides output moderation, scanning AI responses for harmful content, hallucinated URLs, and policy violations. Rate limiting is built into the safety layer with configurable per-user and per-session limits. All safety events are logged with full context for audit purposes.
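The per-user rate limiting described above can be approximated with a fixed-window counter. The `RateLimiter` class below is an illustrative stand-in, not AI Kit's internal class; it only borrows the `maxRequests`/`windowMs` shape from the config in the Integration Guide.

```typescript
// Illustrative fixed-window rate limiter keyed per user. Mirrors the
// { maxRequests, windowMs } config shape; not AI Kit's internal class.
class RateLimiter {
  private windows = new Map<string, { start: number; count: number }>();

  constructor(private maxRequests: number, private windowMs: number) {}

  allow(userId: string, now: number = Date.now()): boolean {
    const w = this.windows.get(userId);
    if (!w || now - w.start >= this.windowMs) {
      // Start a fresh window for this user.
      this.windows.set(userId, { start: now, count: 1 });
      return true;
    }
    if (w.count < this.maxRequests) {
      w.count += 1;
      return true;
    }
    return false; // over the limit for this window: block (and log)
  }
}
```

A sliding-window or token-bucket variant smooths out the burst allowed at each window boundary, at the cost of slightly more bookkeeping.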
Use Cases
- Healthcare applications that must prevent patient data from being sent to third-party AI models under HIPAA compliance.
- Financial services chatbots requiring PCI-DSS compliance with automatic credit card number redaction.
- Education platforms that need age-appropriate content filtering and protection against harmful content generation.
- Enterprise deployments requiring audit logs, rate limiting, and data loss prevention across all AI interactions.
Integration Guide
Wrap your chat component with safety guardrails:

```tsx
import { SafetyProvider, SafetyDashboard } from '@ainative/ai-kit';

function App() {
  return (
    <SafetyProvider
      config={{
        piiDetection: { enabled: true, action: 'redact' },
        injectionProtection: { enabled: true, sensitivity: 'high' },
        contentModeration: { enabled: true, categories: ['harmful', 'adult'] },
        rateLimit: { maxRequests: 50, windowMs: 60_000 },
      }}
      onViolation={(event) => analytics.track('safety_block', event)}
    >
      <Chat />
      <SafetyDashboard />
    </SafetyProvider>
  );
}
```