AI Data Privacy & PII Management

Audit and Compliance

Building audit trails that satisfy SOC 2, HIPAA, and GDPR auditors — what to log, what not to log, DPIAs for AI systems, incident response, and the top 10 auditor questions.

The paradox of PII audit logging

Your pipeline detects PII, redacts it, and ensures it never reaches a cloud AI service. Then someone suggests: "We should log the PII we detected for audit purposes." And suddenly you are storing the very data you built the pipeline to protect.

This is the audit logging paradox, and falling into it is a surprisingly common mistake. Organisations build sophisticated detection and redaction pipelines, then write detailed audit logs that include the detected PII values, effectively creating a new, unprotected store of sensitive data. The audit log becomes the highest-value target in the system.

The principle is straightforward: log the decision, not the data. Your audit trail should capture that PII was detected, what type it was, what the confidence score was, what action was taken (redacted, pseudonymised, blocked), and where the request was routed — but not the actual PII value.
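As a minimal sketch of "log the decision, not the data", the Python below builds an audit record from the fields named above. The names `PiiAuditEvent`, `build_audit_event`, `to_log_line`, `request_id`, and `route` are hypothetical and chosen for illustration; they are not taken from any particular library or from a specific pipeline.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Assumed set of actions, taken from the examples in the text.
ALLOWED_ACTIONS = {"redacted", "pseudonymised", "blocked"}


@dataclass(frozen=True)
class PiiAuditEvent:
    """One audit record. There is deliberately no field for the PII value."""
    timestamp: str      # when the decision was made (UTC, ISO 8601)
    request_id: str     # hypothetical correlation ID for the request
    entity_type: str    # e.g. "PERSON", "EMAIL"
    confidence: float   # detector confidence score
    action: str         # one of ALLOWED_ACTIONS
    route: str          # where the request was sent, e.g. "cloud_ai"


def build_audit_event(request_id: str, entity_type: str, confidence: float,
                      action: str, route: str) -> PiiAuditEvent:
    """Build an audit event from a detection decision.

    The function has no parameter for the matched text, so a caller
    cannot pass the raw PII value into the log through this path.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return PiiAuditEvent(
        timestamp=datetime.now(timezone.utc).isoformat(),
        request_id=request_id,
        entity_type=entity_type,
        confidence=confidence,
        action=action,
        route=route,
    )


def to_log_line(event: PiiAuditEvent) -> str:
    """Serialise the event as a single JSON line for the audit log."""
    return json.dumps(asdict(event), sort_keys=True)


# Example: the detector found a PERSON entity; only the decision is logged.
event = build_audit_event("req-001", "PERSON", 0.92, "pseudonymised", "cloud_ai")
line = to_log_line(event)
```

The design choice here is to make the safe path the only path: because the record type has no slot for the detected value, including it would require a deliberate schema change rather than an accidental extra argument.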


Your audit log currently records: 'Detected PERSON entity John Smith at position 45-55, confidence 0.92, action: pseudonymised to Person_A, routed to cloud AI.' What is wrong with this log entry?