Ben @ Grepture | Compliance

GDPR-Compliant AI: A Developer's Practical Guide

How to build AI features that comply with GDPR — data minimization, processing agreements, PII redaction, and enforcement trends for 2026.

GDPR didn't anticipate LLMs, but it still applies

When GDPR came into force in 2018, nobody was sending customer data to large language models. But the regulation was designed to be technology-neutral, and it maps cleanly onto AI workloads — sometimes uncomfortably so.

Every prompt containing personal data is a processing operation. Every LLM provider is a data processor. Every response cached with user context is stored personal data. If you're building AI features in Europe — or serving European users — GDPR compliance isn't optional.

The problem: most GDPR guidance for AI is written by lawyers, for lawyers. This guide is the engineering version. What you actually need to implement, why, and how.

The regulatory landscape in 2026

GDPR enforcement around AI has escalated sharply:

  • Italy's Garante temporarily banned ChatGPT in 2023 and has continued scrutinizing AI services, issuing a 15 million EUR fine to OpenAI in early 2025
  • France's CNIL published dedicated AI guidance and launched sector-specific audits of AI-powered customer service tools
  • Germany's data protection authorities (Datenschutzkonferenz) have been among the most active, with state-level DPAs issuing binding guidance on AI in the workplace and public sector
  • The EDPB (European Data Protection Board) issued its opinion on AI models and GDPR in 2024, clarifying that models trained on unlawfully processed personal data may themselves be in violation

Meanwhile, the EU AI Act adds a second compliance layer. High-risk AI systems face additional requirements by August 2026 — conformity assessments, quality management, human oversight. For a deep dive on the AI Act timeline, see our EU AI Act compliance guide.

The intersection matters: GDPR and the AI Act aren't alternatives. They stack. You need to comply with both.

Data minimization: the principle that changes everything

GDPR Article 5(1)(c) requires that personal data be "adequate, relevant and limited to what is necessary." In German data protection law, this maps to Datensparsamkeit — data parsimony. Collect and process the minimum data required.

For AI features, this has concrete implications:

Don't send what the model doesn't need. If a user asks your AI assistant to summarize their order status, the model doesn't need the user's full name, email, phone number, and shipping address. Strip everything except what's required for the task.

// Before: full context dump
const prompt = `Summarize the order status for:
  Name: ${user.name}
  Email: ${user.email}
  Phone: ${user.phone}
  Order #${order.id}: ${order.status}, shipped ${order.shippedDate}
  Address: ${order.shippingAddress}`;

// After: minimum necessary data
const prompt = `Summarize the order status:
  Order #${order.id}: ${order.status}, shipped ${order.shippedDate}`;

This seems obvious, but in practice, teams concatenate entire database records into prompts because it's easier than selecting fields. Data minimization requires intentional prompt construction.
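One way to make prompt construction intentional is to force every prompt through an explicit allow-list of fields. A minimal sketch (the `pick` helper is hypothetical, not part of any library):

```typescript
// Hypothetical helper: copy only an explicit allow-list of fields
// into the object that gets interpolated into the prompt.
function pick<T extends object, K extends keyof T>(
  record: T,
  allowed: K[]
): Pick<T, K> {
  const out = {} as Pick<T, K>;
  for (const key of allowed) {
    out[key] = record[key];
  }
  return out;
}

const order = {
  id: "A-1042",
  status: "shipped",
  shippedDate: "2026-01-15",
  shippingAddress: "Musterstr. 1, Berlin", // never reaches the prompt
};

// Only allow-listed fields can end up in the prompt string.
const promptData = pick(order, ["id", "status", "shippedDate"]);
const prompt = `Summarize the order status:
  Order #${promptData.id}: ${promptData.status}, shipped ${promptData.shippedDate}`;
```

The point of the allow-list is that adding a new database column cannot silently expand what the prompt contains; someone has to opt the field in.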

Redact PII before it leaves your infrastructure. Even with careful prompt design, user-generated content (support tickets, chat messages, form inputs) will contain PII you didn't expect. Automated detection catches what manual prompt design misses. Our PII detection best practices guide covers detection strategies in depth.

Don't cache responses containing personal data without a retention policy. If you cache LLM responses for performance, those caches contain processed personal data. Set TTLs. Delete when no longer needed.
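A retention policy is easiest to enforce when the cache itself expires entries. A minimal sketch using an in-memory map (in production this would typically be Redis with `EXPIRE`, but the shape is the same):

```typescript
// Sketch: a response cache whose entries expire after a fixed TTL,
// so cached personal data has a bounded lifetime by construction.
interface CacheEntry {
  value: string;
  expiresAt: number; // epoch milliseconds
}

class TtlCache {
  private store = new Map<string, CacheEntry>();
  constructor(private ttlMs: number) {}

  set(key: string, value: string): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }

  get(key: string): string | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() >= entry.expiresAt) {
      this.store.delete(key); // lazy deletion on expiry
      return undefined;
    }
    return entry.value;
  }

  // Supports Article 17 erasure: drop every cached entry for one user,
  // assuming keys are prefixed with a user identifier.
  deleteByPrefix(prefix: string): void {
    for (const key of this.store.keys()) {
      if (key.startsWith(prefix)) this.store.delete(key);
    }
  }
}
```

Keying cache entries by user (`user:123:…`) is what makes later erasure requests cheap: deletion becomes a prefix scan instead of a forensic search.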

Processing agreements: the paperwork that matters

Under GDPR Article 28, when you send personal data to a data processor (any LLM provider), you need a Data Processing Agreement (DPA) — or in German, an Auftragsverarbeitungsvertrag (AVV).

Every major LLM provider offers one:

| Provider | DPA available | Zero-retention option | EU data residency |
| --- | --- | --- | --- |
| OpenAI | Yes (via API terms) | Yes (API, not ChatGPT) | No (US processing) |
| Anthropic | Yes | Yes (API) | No (US processing) |
| Google (Vertex AI) | Yes | Configurable | Yes (eu-west regions) |
| Azure OpenAI | Yes | Configurable | Yes (EU regions) |
| AWS Bedrock | Yes | Configurable | Yes (eu-west-1, etc.) |
| Mistral (La Plateforme) | Yes | Yes | Yes (EU-native) |

What to check in every DPA:

  1. Sub-processors — Who else touches the data? Most providers use cloud sub-processors (AWS, GCP, Azure). The DPA should list them.
  2. Data location — Where is data processed and stored? For EU-only requirements, US-based providers are problematic unless they offer EU regions.
  3. Retention and deletion — How long does the provider keep your data? Zero-retention API options exist but must be explicitly enabled.
  4. Purpose limitation — The DPA should confirm data is processed only for providing the service, not for model training.
  5. Audit rights — GDPR Article 28(3)(h) gives you the right to audit your processor. In practice, this means SOC 2 reports and compliance certifications.

The international transfer problem

Sending personal data to a US-based LLM provider is an international data transfer under GDPR Chapter V. After the Schrems II ruling invalidated Privacy Shield, transfers to the US rely on:

  • EU-US Data Privacy Framework (DPF) — Adopted in 2023, provides adequacy for certified US companies. OpenAI, Google, Microsoft, and Amazon are certified. But the DPF faces ongoing legal challenges.
  • Standard Contractual Clauses (SCCs) — The fallback. Most DPAs include SCCs.
  • EU-hosted processing — The cleanest solution. Azure OpenAI in West Europe, AWS Bedrock in eu-west-1, or Mistral's EU-native platform avoid the transfer issue entirely.

If you need maximum legal certainty, process data within the EU. A proxy that routes traffic through EU infrastructure — before it reaches any provider — simplifies this significantly.

Lawful basis: which one applies?

GDPR Article 6 requires a lawful basis for every processing operation. For AI features, three are relevant:

Legitimate interest (Article 6(1)(f))

The most common basis for AI features. You have a legitimate interest in providing AI-powered services. But you must conduct a Legitimate Interest Assessment (LIA) that balances your interest against the data subject's rights.

An LIA for AI features should document:

  • What personal data is processed in prompts
  • Why the processing is necessary (not just convenient)
  • What safeguards are in place (PII redaction, zero-retention, encryption)
  • Why the data subject's interests don't override yours

Contract performance (Article 6(1)(b))

If the AI feature is part of a service the user contracted for (e.g., an AI assistant in a SaaS product), you can argue contract performance. This is stronger than legitimate interest but narrower — it only covers processing necessary for delivering the contracted service.

Consent (Article 6(1)(a))

Consent works but introduces operational complexity. It must be freely given, specific, informed, and withdrawable. If a user withdraws consent, you must stop processing their data in AI calls immediately. For most B2B SaaS features, legitimate interest or contract performance is more practical.

Special categories (Article 9): If prompts might contain health data, biometric data, racial or ethnic origin, or other special categories, you need explicit consent or another Article 9 exception. This is common in healthcare, HR, and financial services AI features.

Technical measures that demonstrate compliance

GDPR Article 32 requires "appropriate technical and organisational measures." For AI features, this means:

1. PII redaction at the boundary

Strip personal data from prompts before they leave your infrastructure. This is the single most impactful technical measure for GDPR compliance — it turns a compliance-heavy personal data transfer into a low-risk one.

Redaction approaches:

  • Irreversible redaction — Replace PII with [REDACTED]. Simple, safe, but the AI response loses personalization.
  • Reversible redaction (mask-and-restore) — Replace PII with tokens, store originals in a vault, restore in the response. The model never sees real data, but the user gets personalized output. See our mask-and-restore deep dive.
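Mask-and-restore can be sketched in a few lines. This is an illustration, not Grepture's implementation: a naive email regex stands in for real PII detection, and an in-memory map stands in for an encrypted vault.

```typescript
// Illustrative mask-and-restore. EMAIL_RE is a deliberately naive
// detector; production systems use proper PII detection and a vault.
const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.]+/g;

function mask(text: string): { masked: string; vault: Map<string, string> } {
  const vault = new Map<string, string>();
  let counter = 0;
  const masked = text.replace(EMAIL_RE, (match) => {
    const token = `<PII_${counter++}>`;
    vault.set(token, match); // original never leaves your infrastructure
    return token;
  });
  return { masked, vault };
}

function restore(text: string, vault: Map<string, string>): string {
  let out = text;
  for (const [token, original] of vault) {
    out = out.split(token).join(original);
  }
  return out;
}
```

The model only ever sees `<PII_0>`; the token is swapped back into the response before it reaches the user, so personalization survives while the provider processes no real personal data.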

2. Audit logging

You need to demonstrate what data was processed, when, and how it was protected. Log:

  • What PII was detected and how it was handled (redacted, masked, blocked)
  • Which LLM provider processed the request
  • Whether zero-retention was enabled
  • Response timestamps for retention policy enforcement

An audit trail makes DPA audits and regulator inquiries straightforward instead of panic-inducing.
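The fields above map naturally onto a structured, append-only log record. A sketch of one possible shape (field names are illustrative, not a Grepture schema):

```typescript
// Sketch: one audit record per AI call, capturing what was detected,
// how it was handled, and which processor received the request.
interface AiAuditRecord {
  timestamp: string;          // ISO 8601, used for retention enforcement
  userId: string;             // enables per-user access/erasure requests
  provider: string;           // which LLM processor handled the request
  zeroRetention: boolean;     // was the provider's no-storage mode on?
  piiFindings: {
    type: string;             // e.g. "email", "phone"
    action: "redacted" | "masked" | "blocked";
  }[];
}

function logAiCall(record: AiAuditRecord): string {
  // In production: append to WORM storage or a log pipeline, not stdout.
  const line = JSON.stringify(record);
  console.log(line);
  return line;
}
```

Serializing one JSON line per call keeps the trail queryable per user (Article 15) and per retention window, without storing the raw PII itself.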

3. Zero-retention API configuration

Enable zero-retention / no-training options on every LLM provider you use. This is table stakes:

  • OpenAI API: Data not used for training by default since March 2023. Verify in your organization settings.
  • Anthropic API: No training on API data by default.
  • Azure OpenAI: Abuse monitoring data stored for 30 days by default. Can be disabled for approved customers.

4. Encryption in transit and at rest

TLS 1.2+ for all API calls (every major provider enforces this). Encrypt stored prompts, responses, and logs at rest. This is baseline — not differentiating, but required.

5. Access controls

Limit who can access AI logs, prompts, and responses containing personal data. Role-based access with the principle of least privilege. Log access for audit purposes.

Data subject rights and AI

GDPR gives data subjects rights that apply to AI processing:

  • Right of access (Article 15) — Users can request what personal data you've sent to AI providers. If you log prompts, you need to be able to retrieve them per user.
  • Right to erasure (Article 17) — Users can request deletion. This includes cached responses, logged prompts, and any stored context. If the data was sent to an LLM provider, you need the provider to delete it too (which zero-retention simplifies).
  • Right to object (Article 21) — Users can object to processing based on legitimate interest. You must stop unless you demonstrate compelling legitimate grounds.
  • Automated decision-making (Article 22) — If AI makes decisions with legal or similarly significant effects (loan approvals, hiring), users have the right to human intervention and to meaningful information about the logic involved, its significance, and its consequences.

The practical implication: your AI pipeline needs per-user data retrieval and deletion capabilities. Build this early — retrofitting is painful.
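What "per-user retrieval and deletion" means in code can be sketched like this, with an in-memory array standing in for whatever store actually holds your prompt logs:

```typescript
// Sketch: per-user export (Article 15) and erasure (Article 17)
// over stored prompt logs. The array stands in for a real database.
interface PromptLog {
  userId: string;
  prompt: string;
  createdAt: string;
}

class PromptStore {
  private logs: PromptLog[] = [];

  add(log: PromptLog): void {
    this.logs.push(log);
  }

  // Article 15: export everything held about one user.
  exportForUser(userId: string): PromptLog[] {
    return this.logs.filter((l) => l.userId === userId);
  }

  // Article 17: erase one user's data; returns entries removed.
  eraseUser(userId: string): number {
    const before = this.logs.length;
    this.logs = this.logs.filter((l) => l.userId !== userId);
    return before - this.logs.length;
  }
}
```

The essential design decision is indexing every stored prompt, response, and cache entry by user ID from day one; both rights then reduce to a single filtered query.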

A compliance checklist for engineering teams

Before shipping AI features to production:

  • DPA signed with every LLM provider you use
  • Lawful basis documented (LIA for legitimate interest, or contract/consent)
  • Data minimization — prompts contain only necessary data
  • PII redaction — automated detection and handling before data leaves your infra
  • Zero-retention enabled on all provider accounts
  • EU data residency evaluated (mandatory for some industries/jurisdictions)
  • Audit logging — what was sent, when, how it was protected
  • Retention policy — cached responses and logs have TTLs
  • Data subject rights — can retrieve and delete per-user data
  • Privacy impact assessment (DPIA) — required for high-risk processing under Article 35
  • Record of processing activities — updated to include AI processing operations

How Grepture helps

Implementing GDPR-compliant AI from scratch means building PII detection, redaction logic, audit logging, and retention policies — all before you write a single AI feature.

Grepture handles this at the proxy layer. Every API call to an LLM provider flows through Grepture, where personal data is detected and redacted or masked before reaching the provider. Every request is logged with what was detected and how it was handled — giving you the audit trail GDPR requires.

Because it works at the network level, you don't integrate per-service or per-team. One proxy covers every AI call across your organization. For teams that need EU data residency, Grepture's managed service runs in Frankfurt.

Key takeaways

  • GDPR applies to every AI API call containing personal data — each prompt is a processing operation with a third-party processor.
  • Data minimization is your highest-leverage compliance measure — strip personal data from prompts before they leave your infrastructure.
  • Sign DPAs with every LLM provider and verify zero-retention, sub-processor lists, and data location.
  • The EU AI Act stacks on top of GDPR — high-risk AI systems face additional requirements by August 2026.
  • Build audit logging and per-user data retrieval early — data subject rights requests will come, and retrofitting is expensive.