Threat modelling for enterprise AI, architecture patterns from fully local to hybrid with sanitisation, PII detection and redaction, audit trails, and compliance mapping for GDPR, HIPAA, ITAR, and SOC 2.
Start with the threat model
Before selecting a privacy architecture, you need to understand what you are protecting against. "Privacy" is not a single requirement -- it is a spectrum of threats, and different architectures defend against different threats.
Threat 1: Data exfiltration via AI provider. Your data is sent to a third party (the AI provider) who could store it, log it, use it for training, or be compelled by law enforcement to disclose it. Even with contractual protections, the data has still been on their infrastructure.
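Threat 2: Data exposure during transit. Data is intercepted between your systems and the AI provider. TLS mitigates this for internet-facing APIs, but the risk increases in complex network topologies, for example where TLS is terminated at an intermediate proxy or gateway and traffic continues unencrypted behind it.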
Threat 3: Unauthorised internal access. An employee accesses AI-processed data they should not have access to. The AI system becomes a side channel for accessing restricted information.
Threat 4: Model output leakage. The model's responses reveal information from its training data or from other users' queries (in multi-tenant systems). Prompt injection attacks could cause the model to disclose system prompts or context from other sessions.
Threat 5: Regulatory non-compliance. The AI processing itself creates a compliance violation, even if no data is actually misused. The act of sending data to external infrastructure may violate regulations regardless of what the recipient does with it.
Threat 6: Inference about individuals. The AI system infers sensitive attributes (health status, political views, financial situation) from seemingly innocuous data, creating derived data that is itself subject to privacy regulations.
Not every organisation faces all six threats. A marketing team using AI on public data faces mainly threat 3. A healthcare organisation processing patient records faces all six. Your architecture should be proportional to your actual threat profile.
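One way to make a threat profile explicit is to record it as data, so that a candidate architecture can be checked against it. The sketch below is illustrative only: the `Threat` enum mirrors the six threats above, the two profiles encode the examples from the preceding paragraph, and all names are hypothetical.

```python
from enum import Enum


class Threat(Enum):
    """The six threats described above, numbered as in the text."""
    PROVIDER_EXFILTRATION = 1
    TRANSIT_EXPOSURE = 2
    UNAUTHORISED_INTERNAL_ACCESS = 3
    MODEL_OUTPUT_LEAKAGE = 4
    REGULATORY_NON_COMPLIANCE = 5
    INFERENCE_ABOUT_INDIVIDUALS = 6


# Hypothetical profiles encoding the two examples from the text.
THREAT_PROFILES = {
    "marketing_public_data": {Threat.UNAUTHORISED_INTERNAL_ACCESS},
    "healthcare_patient_records": set(Threat),
}


def unaddressed(profile: set, mitigated: set) -> set:
    """Return the threats in a profile that a candidate architecture does not mitigate."""
    return profile - mitigated
```

Calling `unaddressed` with a profile and the set of threats an architecture is believed to mitigate returns whatever is still exposed. An empty result suggests the architecture is proportionate to that profile.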
A government intelligence agency wants to use AI for analysing classified signals intelligence. Which threats are most critical?
Four privacy architecture patterns
Each pattern trades off privacy, cost, capability, and operational complexity differently.
Pattern 1: Fully Local (Maximum Privacy)
User Device
├── Local model (browser or native app)
├── Local embeddings and vector index
├── Local data storage
└── No network connections for AI workloads
Privacy level: Maximum. Data never leaves the device.
Threats mitigated: All six. No external attack surface.
Capability: Limited to what runs on the device (2-12B models typically).
Use cases: Personal knowledge bases, field workers, classified environments, individual professional tools.
Limitations: No cross-user capabilities, limited model size, user is responsible for their own device security.
Pattern 2: On-Premises Centralised (High Privacy)
Enterprise Network
├── vLLM cluster (GPU servers)
├── Vector database (on-prem)
├── Document ingestion pipeline
├── Internal API gateway with auth
└── No external AI API calls
Privacy level: High. Data stays within the enterprise network.
Capability: 27-70B models, full RAG pipelines, high throughput.
Use cases: Enterprise-wide AI service, regulated industries, organisations with existing data centre infrastructure.
Limitations: Capital expense for GPU hardware, requires infrastructure team to maintain.
Pattern 3: VPC-Isolated Cloud (Moderate Privacy)
Cloud VPC (your tenant, your network)
├── GPU instances (cloud provider)
├── Model served within your VPC
├── Data encrypted at rest and in transit
├── No data leaves your VPC
└── Cloud provider has physical access to hardware
Privacy level: Moderate. Data stays in your cloud tenant but exists on shared physical infrastructure.
Capability: Equivalent to on-premises, with cloud scalability.
Use cases: Organisations without data centres, moderate regulatory requirements, need for elastic scaling.
Limitations: Cloud provider has theoretical physical access. Regulatory interpretation varies on whether VPC isolation satisfies data residency requirements.
Pattern 4: Hybrid with Sanitisation (Pragmatic Privacy)
On-Premises / Edge
├── Local model handles most queries
├── PII gateway sanitises queries that need cloud escalation
└── Sanitised queries → Cloud API
    └── Cloud response → PII restoration → User
Capability: Edge quality for most queries, frontier quality for complex ones.
Use cases: Organisations that need cloud quality for some tasks but must protect PII.
Limitations: PII detection is imperfect. Sanitisation may remove context needed for the best answer. Regulatory acceptance varies.
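The sanitise-then-restore round trip in Pattern 4 can be sketched with reversible placeholders. This is a minimal illustration, not a production gateway: it assumes e-mail addresses are the only PII, and the function names `sanitise` and `restore` are hypothetical.

```python
import re

# Assumption: only e-mail addresses are treated as PII, to keep the sketch short.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def sanitise(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct e-mail address with a numbered placeholder."""
    mapping: dict[str, str] = {}   # placeholder -> original value
    seen: dict[str, str] = {}      # original value -> placeholder

    def _swap(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            placeholder = f"<EMAIL_{len(seen) + 1}>"
            seen[value] = placeholder
            mapping[placeholder] = value
        return seen[value]

    return EMAIL.sub(_swap, text), mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back into the cloud model's response."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text
```

The mapping never leaves the on-premises side. Only the sanitised text is sent to the cloud API, and `restore` is applied to the response before it reaches the user. Reusing the same placeholder for repeated values lets the cloud model still see that two mentions refer to the same entity.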
A law firm handles mergers and acquisitions. Deal details are extremely confidential -- even the existence of a deal is market-sensitive information. Which architecture pattern?
Building a PII detection layer
PII detection is the technical foundation of privacy-preserving AI architectures. It is also where most implementations are weaker than their owners assume.
What counts as PII:
Under GDPR, personal data is any information relating to an identified or identifiable natural person. This includes the obvious (name, email, phone, address, date of birth, national ID numbers) and the less obvious:
IP addresses
Cookie identifiers
Location data
Biometric data
Genetic data
Political opinions, religious beliefs (special category data)
Health information
Trade union membership
Data that could identify someone in combination with other data
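Some of the identifiers listed above have a regular shape and can be caught with patterns before any model is involved. The sketch below assumes two deliberately simplified patterns (e-mail and IPv4); they are illustrative, will miss valid forms, and the names `PATTERNS` and `find_pattern_pii` are hypothetical.

```python
import re

# Simplified, illustrative patterns; real deployments need broader, tested ones.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_pattern_pii(text: str) -> list[tuple[str, str, int, int]]:
    """Return (type, matched text, start, end) for every pattern hit, in text order."""
    hits = []
    for pii_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((pii_type, m.group(0), m.start(), m.end()))
    return sorted(hits, key=lambda h: h[2])
```

Pattern matching is cheap and predictable, but it only covers identifiers with a fixed format. Names, locations, and contextual references need the model-based approaches that follow.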
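# Using Microsoft Presidio (open source PII detection)
# Requires the presidio-analyzer and presidio-anonymizer packages plus a spaCy English model.
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = "Dr. Sarah Thompson reviewed the case at St Mary's Hospital on 15 March."

# Find PII entities with the default recognisers
results = analyzer.analyze(text=text, language='en')
# Typically detects: PERSON (Sarah Thompson), LOCATION (St Mary's Hospital), DATE_TIME (15 March)
# The exact entities depend on the spaCy model in use.

# Replace each detected span with its entity type
anonymized = anonymizer.anonymize(text=text, analyzer_results=results)
print(anonymized.text)
# With the detections above: "Dr. <PERSON> reviewed the case at <LOCATION> on <DATE_TIME>."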
Presidio combines rule-based detection with spaCy NER models. It is production-ready, supports multiple languages, and is extensible with custom recognisers.
Layer 3: Context-aware detection (LLM-based)
For the highest quality PII detection, use your local LLM:
# `local_model` is assumed to expose an async generate(prompt, temperature, max_tokens)
# method that returns text; `parse_findings` must be defined elsewhere.
async def detect_contextual_pii(text: str, local_model) -> list:
    prompt = f"""Identify all personally identifiable information in the following text.
Include: names, addresses, phone numbers, emails, dates that could identify someone,
organisation names that reveal confidential business relationships,
and any other information that could directly or indirectly identify a person.

Format each finding as: [TYPE]: "exact text"

Text: {text}

Findings:"""
    # temperature=0 keeps the output as deterministic as the model allows
    response = await local_model.generate(prompt, temperature=0, max_tokens=500)
    return parse_findings(response)
This catches contextual PII that neither regex nor standard NER detects: nicknames, indirect references ("my manager" in a small team), organisation names that reveal relationships, and combinations of innocuous data that together identify someone.
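The LLM-based detector relies on a `parse_findings` helper that the text does not define. One possible sketch, assuming the model follows the requested `[TYPE]: "exact text"` format with one finding per line, is shown below; the regular expression and the returned dictionary shape are assumptions.

```python
import re

# Matches lines such as:  [PERSON]: "Sarah Thompson"   (brackets optional)
FINDING = re.compile(r'^\s*\[?(?P<type>[A-Za-z_ ]+?)\]?\s*:\s*"(?P<text>[^"]+)"\s*$')


def parse_findings(response: str) -> list[dict[str, str]]:
    """Extract {type, text} pairs from the model's reply, skipping unparseable lines."""
    findings = []
    for line in response.splitlines():
        match = FINDING.match(line)
        if match:
            findings.append({
                "type": match.group("type").strip().upper(),
                "text": match.group("text"),
            })
    return findings
```

Silently skipping lines that do not match is convenient but risks missed PII. In a privacy-critical deployment it may be safer to treat an unparseable reply as a failure and block cloud escalation for that query.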
The cost of false negatives vs false positives:
False negative (missed PII): A privacy violation. Sensitive data reaches the cloud API. Depending on the data and regulation, this could be a reportable breach.
False positive (over-redaction): The sanitised query loses context, and the cloud model's answer quality degrades. Annoying but not a compliance violation.
For privacy-critical deployments, bias toward false positives. It is better to over-redact and get a slightly worse answer from the cloud model than to miss PII and create a compliance exposure.
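Biasing toward false positives can be expressed as two choices: accept findings at a deliberately low confidence threshold, and redact the union of overlapping spans. The sketch below assumes each detector reports character offsets and a score between 0 and 1; the `Finding` type and the default threshold of 0.3 are illustrative assumptions, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    start: int      # inclusive character offset
    end: int        # exclusive character offset
    score: float    # detector confidence in [0, 1]


def spans_to_redact(findings: list[Finding], threshold: float = 0.3) -> list[tuple[int, int]]:
    """Keep every finding at or above a deliberately low threshold and merge overlaps."""
    kept = sorted((f.start, f.end) for f in findings if f.score >= threshold)
    merged: list[tuple[int, int]] = []
    for start, end in kept:
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent span: extend the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def redact(text: str, spans: list[tuple[int, int]], mask: str = "<REDACTED>") -> str:
    """Replace each span with a mask, working right to left so offsets stay valid."""
    for start, end in reversed(spans):
        text = text[:start] + mask + text[end:]
    return text
```

Lowering the threshold trades answer quality for safety: a low-confidence finding that covers a full name is redacted in full, where a stricter threshold would have masked only the high-confidence first name.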
Logging inference without logging content
Enterprise AI deployments need audit trails for compliance, security, and operational monitoring. But logging the content of AI queries and responses creates its own privacy risk -- you are now storing potentially sensitive data in log files.
The solution: metadata logging.
Log everything about the inference except the content:
This log entry tells you: who used AI, when, which model, how many tokens, whether PII was detected, what classification the data had, and how the query was routed. It does not contain any of the actual query text or response text.
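The fields described above can be sketched as a structured record. This is a hypothetical example: the field names, the JSON encoding, and the `build_inference_log` helper are assumptions, and the only firm requirement taken from the text is that no query or response text appears in the entry.

```python
import json
import uuid
from datetime import datetime, timezone


def build_inference_log(user_id: str, model: str, prompt_tokens: int,
                        completion_tokens: int, pii_detected: bool,
                        data_classification: str, route: str) -> str:
    """Serialise one inference event as JSON, with no query or response text."""
    entry = {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "model": model,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "pii_detected": pii_detected,
        "data_classification": data_classification,
        "route": route,
    }
    return json.dumps(entry)
```

Because the function signature has no parameter for the prompt or the response, content cannot end up in the log by accident. Note that `user_id` is itself personal data under GDPR, so the log still needs access controls and a retention policy.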
When you must log content:
Some compliance frameworks (FDA 21 CFR Part 11, certain financial regulations) require that the actual inputs and outputs of AI systems be logged for audit purposes. In this case:
Log content to an encrypted, access-controlled audit store -- separate from operational logs
Apply the same retention and access policies as the underlying data classification
Ensure the audit store itself is within your compliance boundary (on-premises for ITAR, EU-resident for GDPR)
Compliance mapping
This is the practical reference table. For each major regulation, it shows which of the four privacy architecture patterns can satisfy its requirements.
| Requirement | Pattern 1 (Local) | Pattern 2 (On-prem) | Pattern 3 (VPC Cloud) | Pattern 4 (Hybrid) |
| --- | --- | --- | --- | --- |
| GDPR: data minimisation | Yes | Yes | Yes | Partial (sanitised data sent) |
| GDPR: cross-border transfer | Yes (no transfer) | Yes (if EU-located) | Yes (if EU region) | Depends on cloud location |
| GDPR: right to erasure | User deletes locally | Central deletion | Central deletion | Complex (local + cloud) |
| HIPAA: PHI protection | Yes | Yes (with BAA for hosting) | Depends on provider BAA | Only if PHI never reaches cloud |
| ITAR: US person access only | N/A (device-level) | Yes (if US-person infra) | Rarely compliant | Not compliant for controlled data |
| FedRAMP High | N/A | Yes (if certified infra) | Only with authorised provider | Not compliant |
| SOC 2 Type II | N/A (no service) | Yes (with proper controls) | Yes (with provider attestation) | Yes (with proper controls) |
| PCI DSS | Yes (no transmission) | Yes (with segmentation) | Yes (with proper scoping) | Only if card data never reaches cloud |
| CCPA/CPRA | Yes | Yes | Yes | Yes (with disclosure) |
Key takeaways:
ITAR and FedRAMP High effectively require Pattern 1 or Pattern 2. Cloud options are extremely limited.
GDPR is satisfiable with any pattern if implemented correctly, but Pattern 4 adds complexity around cross-border transfers and right to erasure.
HIPAA is most straightforwardly satisfied by Pattern 1 or Pattern 2 for PHI; Pattern 3 depends on the cloud provider signing a BAA. Pattern 4 works only if the PII gateway guarantees PHI never reaches the cloud -- which is hard to guarantee with imperfect PII detection.
SOC 2 is the most flexible -- it cares about controls being in place and operating effectively, not about where the data physically resides.
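The table can also be queried programmatically when several regulations apply at once. The sketch below copies three rows from the table verbatim and shortlists patterns whose cell starts with "Yes" or "N/A" for every requirement. Treating "N/A" as non-blocking is an assumption of this sketch, and a conditional "Yes (if ...)" still needs case-by-case review.

```python
# Cells copied from the table above; index 0..3 = Pattern 1..4. Only three rows are shown.
COMPLIANCE = {
    "HIPAA: PHI protection": ["Yes", "Yes (with BAA for hosting)",
                              "Depends on provider BAA", "Only if PHI never reaches cloud"],
    "ITAR: US person access only": ["N/A (device-level)", "Yes (if US-person infra)",
                                    "Rarely compliant", "Not compliant for controlled data"],
    "SOC 2 Type II": ["N/A (no service)", "Yes (with proper controls)",
                      "Yes (with provider attestation)", "Yes (with proper controls)"],
}


def candidate_patterns(requirements: list[str]) -> list[int]:
    """Return pattern numbers whose cell is 'Yes...' or 'N/A...' for every requirement."""
    candidates = []
    for index in range(4):
        cells = [COMPLIANCE[req][index] for req in requirements]
        if all(cell.startswith(("Yes", "N/A")) for cell in cells):
            candidates.append(index + 1)
    return candidates
```

For an organisation subject to both ITAR and HIPAA, this shortlist contains only Patterns 1 and 2, which matches the first takeaway above. The result is a starting point for a compliance review, not a substitute for one.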
Your organisation must comply with both GDPR (EU customer data) and SOC 2 Type II. You want to use a hybrid architecture (Pattern 4). What is the critical compliance risk?
Module 10 -- Final Assessment
1. Why should a PII detection system be biased toward false positives (over-redaction) rather than false negatives (missed PII)?
2. What is the primary advantage of metadata logging over full content logging for AI inference audit trails?
3. Which privacy architecture pattern is required for ITAR-controlled technical data?
4. Under GDPR, which of the following qualifies as personal data that a PII detection system should catch?