
AI Safety Features

Protect your AI applications with built-in PII detection, prompt injection protection, and content moderation.


PIIDetector

Detects emails, phone numbers, SSNs, credit cards, and more.

PromptInjectionDetector

Identifies attempts to manipulate AI behavior through prompts.

JailbreakDetector

Detects known jailbreak patterns and role-playing attacks.

ContentModerator

Flags harmful content across multiple categories.
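To make the pattern-based approach behind a detector like PromptInjectionDetector concrete, here is a minimal sketch. It is an illustrative toy, not AI Kit's implementation: the `INJECTION_PATTERNS` list and the `detectInjection` helper are assumptions introduced for this example.

```typescript
// Illustrative only: a few regexes covering common injection phrasings.
// Real detectors combine many more signals than simple pattern matching.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all )?(previous|prior|above) instructions/i, // instruction override
  /you are now (a|an) /i,                               // role override
  /disregard (your|the) (rules|guidelines|system prompt)/i,
  /reveal (your|the) system prompt/i,                   // instruction leaking
];

function detectInjection(input: string): boolean {
  return INJECTION_PATTERNS.some((pattern) => pattern.test(input));
}
```

A real detector would also normalize the input first (Unicode tricks, spacing, encoding) so that trivially obfuscated attacks do not slip past the patterns.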

How It Works

AI Kit's safety layer provides a multi-tiered defense system that sits between your users and the language model. Every message passes through configurable guardrails before reaching the AI and again before the response is shown to the user. The first tier handles PII detection, scanning for patterns like email addresses, phone numbers, social security numbers, and credit card numbers using a combination of regex patterns and named entity recognition. Detected PII is automatically redacted or flagged depending on your configuration.
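The regex side of that first tier can be sketched as follows. This is a simplified illustration under stated assumptions: the `PII_PATTERNS` table and `redactPII` function are hypothetical names, and production PII detection would pair patterns like these with named entity recognition, as described above.

```typescript
// Illustrative sketch of regex-based PII redaction (first tier).
// Patterns are simplified; real detectors handle many more formats.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phone: /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  creditCard: /\b(?:\d[ -]?){13,16}\b/g,
};

function redactPII(text: string): string {
  let result = text;
  for (const [label, pattern] of Object.entries(PII_PATTERNS)) {
    // Replace each match with a labeled placeholder, e.g. [EMAIL REDACTED].
    result = result.replace(pattern, `[${label.toUpperCase()} REDACTED]`);
  }
  return result;
}
```

Whether a match is redacted like this or merely flagged is exactly the `action` choice exposed in the configuration shown later in this page.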

The second tier focuses on prompt injection protection. AI Kit analyzes incoming messages for common injection techniques such as role override attempts, delimiter injection, and instruction leaking. Suspicious inputs are blocked before they reach the model. The third tier provides output moderation, scanning AI responses for harmful content, hallucinated URLs, and policy violations. Rate limiting is built into the safety layer with configurable per-user and per-session limits. All safety events are logged with full context for audit purposes.
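The per-user rate limiting described above can be approximated with a sliding window. The class below is a minimal sketch for illustration; the `SlidingWindowLimiter` name and shape are assumptions, not AI Kit's exported API.

```typescript
// Illustrative per-user sliding-window rate limiter.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(
    private maxRequests: number, // e.g. 50
    private windowMs: number,    // e.g. 60_000
  ) {}

  // Returns true if the request is allowed, false if the user is over limit.
  allow(userId: string, now: number = Date.now()): boolean {
    // Keep only timestamps still inside the window.
    const recent = (this.hits.get(userId) ?? []).filter(
      (t) => now - t < this.windowMs,
    );
    if (recent.length >= this.maxRequests) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(userId, recent);
    return true;
  }
}
```

The `maxRequests` and `windowMs` parameters here mirror the `rateLimit` configuration keys used in the integration example below.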

Use Cases

  • Healthcare applications that must prevent patient data from being sent to third-party AI models under HIPAA compliance.
  • Financial services chatbots requiring PCI-DSS compliance with automatic credit card number redaction.
  • Education platforms that need age-appropriate content filtering and protection against harmful content generation.
  • Enterprise deployments requiring audit logs, rate limiting, and data loss prevention across all AI interactions.

Integration Guide

Wrap your chat component with safety guardrails:

import { SafetyProvider, SafetyDashboard } from '@ainative/ai-kit';

// `analytics` is your application's own tracking client.
declare const analytics: { track(event: string, data: unknown): void };

function App() {
  return (
    <SafetyProvider
      config={{
        piiDetection: { enabled: true, action: 'redact' },
        injectionProtection: { enabled: true, sensitivity: 'high' },
        contentModeration: { enabled: true, categories: ['harmful', 'adult'] },
        rateLimit: { maxRequests: 50, windowMs: 60_000 },
      }}
      onViolation={(event) => analytics.track('safety_block', event)}
    >
      <Chat />
      <SafetyDashboard />
    </SafetyProvider>
  );
}