PII (Personally Identifiable Information)

The PII metric identifies whether the model generates or leaks personally identifiable information (PII) in its output. It safeguards against accidental or malicious disclosure of sensitive data that could expose real individuals to privacy risks, identity theft, or unauthorized profiling.

This metric is essential for any LLM system that operates under privacy regulations (e.g., GDPR, HIPAA), handles user input logs, or interfaces with memory, user profiles, or retrieval-augmented generation (RAG) systems.

When to Use

Deploy this metric in:

  • Enterprise assistants where user inputs or personas may be reused
  • Chatbots exposed to public input or long-term memory
  • RAG systems with access to semi-private or internal documents
  • Moderation tools scanning for information risk

Score

The API returns a score (float, 0.0 – 1.0) under the pii key.

  • 1.0: No personally identifiable information detected.
  • 0.7–0.99: Possibly identifying traits, but no direct PII.
  • 0.2–0.7: Indirect or partial PII patterns found.
  • 0.0–0.2: Explicit PII such as names, emails, or phone numbers.

A higher score is better. A lower score indicates disclosure of sensitive personal information.
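As an illustration of how a caller might act on these bands, here is a minimal sketch that maps a score to a label. The function name `pii_band` and the label strings are hypothetical. The ranges above share their endpoints (0.7 and 0.2) and leave values between 0.99 and 1.0 unassigned, so this sketch assumes a boundary value belongs to the higher band and treats anything below 1.0 but at or above 0.7 as "possible".

```python
def pii_band(score: float) -> str:
    """Map a PII score to a coarse label (hypothetical helper, not part of the SDK)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0.0, 1.0], got {score}")
    if score == 1.0:
        return "none"        # no PII detected
    if score >= 0.7:
        return "possible"    # identifying traits, but no direct PII
    if score >= 0.2:
        return "indirect"    # indirect or partial PII patterns
    return "explicit"        # names, emails, phone numbers, etc.


for s in (1.0, 0.85, 0.5, 0.1):
    print(s, pii_band(s))
```

A stricter deployment could assign boundary values to the lower band instead; the documented ranges do not say which is intended.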

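The detector evaluates the output against a set of instructions, each of which yields a follow_probability. The score is the lowest follow_probability among all evaluated instructions.
Taking the minimum rather than an average ensures that a single serious violation pulls the overall score down to reflect its risk.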
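The minimum-based aggregation can be sketched in a few lines. The per-instruction records below are invented for illustration, and their shape (a list of dicts with `instruction` and `follow_probability` keys) is an assumption, not the documented response format.

```python
def overall_pii_score(instruction_results):
    """Return the lowest follow_probability across all evaluated instructions."""
    if not instruction_results:
        raise ValueError("at least one evaluated instruction is required")
    return min(r["follow_probability"] for r in instruction_results)


# Hypothetical per-instruction results; the instruction wording is made up.
sample = [
    {"instruction": "Do not reveal email addresses", "follow_probability": 0.98},
    {"instruction": "Do not reveal government ID numbers", "follow_probability": 0.04},
    {"instruction": "Do not reveal phone numbers", "follow_probability": 0.91},
]

print(overall_pii_score(sample))
```

With an average, the two well-followed instructions would mask the one that was clearly violated; the minimum keeps that violation visible.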

API Request & Response Example

[
  {
    "generated_text": "The social security number of Jane Doe is 123-45-6789.",
    "config": {
      "pii": {
        "detector_name": "default",
        "explain": true
      }
    }
  }
]

Code Examples

# Synchronous example

import os
from aimon import Client
import json

# Initialize client
client = Client(auth_header=f"Bearer {os.environ['AIMON_API_KEY']}")

# Construct payload
payload = [{
    "generated_text": "Her SSN is 123-45-6789 and her email is alice@example.com.",
    "config": {
        "pii": {
            "detector_name": "default",
            "explain": True
        }
    },
    "publish": False
}]

# Call sync detect
response = client.inference.detect(body=payload)

# Print result
print(json.dumps(response[0].pii, indent=2))
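Because the request body is a list, it is natural to build one entry per generated text. The helper below is a minimal sketch that produces entries in the same shape as the payload above; `build_pii_payload` is a hypothetical name, and whether the API accepts batches of any particular size is not stated here.

```python
def build_pii_payload(texts, explain=True, publish=False):
    """Build one PII-check entry per text (hypothetical helper, not part of the SDK)."""
    return [
        {
            "generated_text": text,
            "config": {
                "pii": {
                    "detector_name": "default",
                    "explain": explain,
                }
            },
            "publish": publish,
        }
        for text in texts
    ]


payload = build_pii_payload([
    "Her SSN is 123-45-6789.",
    "The meeting is at 3 pm.",
])
print(len(payload))
```

The resulting list could then be passed as `body=payload` to the detect call shown above.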