Ben @ Grepture

EU AI Act Compliance for AI Engineers: What You Need to Do Before August 2026

A developer-focused guide to EU AI Act compliance before the August 2026 deadline — requirements, data governance, and practical steps.

The August 2026 deadline is real

The EU AI Act entered into force in August 2024. The obligations for high-risk AI systems take full effect in August 2026. That's soon.

It's the world's first comprehensive AI regulation. Penalties go up to 35 million EUR or 7% of global annual turnover — whichever is higher. GDPR maxes out at 20 million EUR or 4%. The AI Act goes further.

If you're building AI features — support bots, document processing, decision support, anything using an LLM — you almost certainly have obligations. The question isn't whether the AI Act applies to you. It's which requirements, and whether you'll be ready.

This post covers what engineering teams need to do before August 2026. Practical steps, no legal jargon.

Who it applies to: providers vs deployers

The AI Act distinguishes between two key roles:

  • Provider: The entity that develops an AI system or has one developed, and places it on the market or puts it into service under its own name or trademark.
  • Deployer: The entity that uses an AI system under its authority. Not the end user — the organization that integrates the AI system into its operations.

If you're building features on top of OpenAI, Anthropic, Google, or any other third-party LLM, you're almost certainly a deployer. The LLM provider is the provider. You still have obligations — different ones, but real ones.

Some teams are both. If you fine-tune a model or build a system that combines multiple AI components into something new, you might be classified as a provider for that system. The classification matters because providers face stricter requirements.

Risk classification

The AI Act defines four risk tiers:

  1. Unacceptable risk — Banned outright. Social scoring, real-time remote biometric identification in public spaces (with exceptions), manipulation of vulnerable groups. If you're doing any of these, stop.
  2. High-risk — Strict requirements. AI systems used in employment decisions, creditworthiness assessments, education admissions, critical infrastructure, law enforcement, migration/border control. Requires conformity assessments, quality management systems, human oversight, and extensive documentation.
  3. Limited risk — Transparency obligations. Chatbots must disclose they're AI. AI-generated content must be labeled. Emotion recognition and biometric categorization systems must inform users.
  4. Minimal risk — No specific obligations under the AI Act (though GDPR still applies).

Here's the thing most engineering teams miss: many enterprise AI features land in the limited or high-risk categories. A customer support bot that influences service outcomes? Potentially high-risk if it makes or significantly influences decisions about people. A document processing system that extracts data for insurance claims? High-risk. An internal tool that helps HR screen candidates? High-risk.

Even if your system is limited risk, you still need transparency measures. And if you're processing personal data (you probably are), GDPR applies regardless of AI Act classification.

Requirements that matter for engineering teams

Three articles have direct implications for how you build and operate AI features.

Article 10: Data governance

High-risk AI systems must be developed using training, validation, and testing datasets that are subject to appropriate data governance and management practices. For deployers using third-party LLMs, this translates to: what data you send to the model matters.

You need to:

  • Understand what data flows into your AI systems
  • Ensure input data is relevant and representative
  • Identify and address potential biases
  • Document your data governance practices

For engineering teams, this means you need visibility into what your application sends to LLM providers. Not a vague sense of "we send customer messages" — actual, auditable knowledge of what personal data flows through your AI pipeline.

Article 13: Transparency

High-risk AI systems must be designed so that deployers can interpret their outputs and use the system appropriately. In practice, this means:

  • Logging what goes into the model and what comes out
  • Understanding how the system behaves across different inputs
  • Providing documentation on intended use and limitations
  • Enabling human oversight of the system's operation

For engineers: you need audit trails. Every request to an LLM — what data it contained, what rules were applied, what came back. Not just for debugging. For compliance evidence.
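One way to start building that trail is to wrap whatever LLM client you already use so that every call leaves a record. A minimal sketch in TypeScript — `callModel` stands in for your real client, and the entry logs sizes rather than raw content so the audit log itself doesn't become a PII store:

```typescript
// Minimal audit-trail wrapper around an arbitrary LLM call.
// `callModel` is a placeholder for whatever client you already use.
type AuditEntry = {
  timestamp: string;
  endpoint: string;
  rulesApplied: string[];
  requestChars: number;  // size only -- never log the raw prompt
  responseChars: number;
};

const auditLog: AuditEntry[] = [];

async function auditedCall(
  endpoint: string,
  prompt: string,
  rulesApplied: string[],
  callModel: (p: string) => Promise<string>
): Promise<string> {
  const response = await callModel(prompt);
  auditLog.push({
    timestamp: new Date().toISOString(),
    endpoint,
    rulesApplied,
    requestChars: prompt.length,
    responseChars: response.length,
  });
  return response;
}
```

The wrapper is intentionally thin: the point is that logging happens in one place, on every call, rather than being left to each feature team.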

Article 15: Accuracy, robustness, and cybersecurity

High-risk AI systems must achieve appropriate levels of accuracy, robustness, and cybersecurity. They must be resilient to attempts by unauthorized third parties to alter their use or performance by exploiting system vulnerabilities.

Read that last part again. The AI Act explicitly requires resilience against adversarial attacks on AI systems. This includes prompt injection — attempts to manipulate model behavior through crafted inputs. Under Art. 15, prompt injection prevention isn't a nice-to-have security measure. It's a compliance requirement.

If you're not already detecting and blocking prompt injection attempts, this is a regulatory reason to start. See our prompt injection prevention guide for the technical details.

Where AI Act meets GDPR: the double compliance challenge

If you're operating in the EU (or processing EU residents' data), you already know GDPR. The AI Act doesn't replace GDPR — it adds a layer on top of it. And the two regulations reinforce each other in ways that matter for AI engineering.

Data minimization hits different with LLMs

GDPR Article 5 requires data minimization: only process personal data that's necessary for the specified purpose. This has always applied to AI, but with LLMs the requirement becomes acute.

Every customer name, email address, physical address, phone number, or account identifier in a prompt needs justification. Are you sending it because the model needs it to generate a useful response? Or because it was in the source data and nobody stripped it out?

Most teams we talk to are surprised by how much PII flows through their AI pipelines when they actually look. Support ticket text gets sent verbatim. Customer records get dumped into context windows. Internal documents with employee names and contact details get used as retrieval-augmented generation sources. All of this is personal data processing under GDPR.
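The fix is structural: build prompt context from an allowlist of fields instead of serializing whole records. A sketch in TypeScript — the `SupportTicket` shape and its fields are illustrative, not a real schema:

```typescript
// Data minimization sketch: only allowlisted fields reach the prompt.
// `SupportTicket` is an illustrative shape, not a real schema.
interface SupportTicket {
  id: string;
  subject: string;
  body: string;
  customerName: string;   // personal data -- not needed for a draft reply
  customerEmail: string;  // personal data -- not needed for a draft reply
}

const PROMPT_FIELDS: (keyof SupportTicket)[] = ["subject", "body"];

function buildPromptContext(ticket: SupportTicket): string {
  return PROMPT_FIELDS
    .map((field) => `${field}: ${ticket[field]}`)
    .join("\n");
}
```

With an allowlist, adding a new field to the prompt is a deliberate, reviewable change — the default is that personal data stays out.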

Third-party LLMs and data processing

Sending personal data to a third-party LLM provider makes that provider a data processor under GDPR Article 28. You need a Data Processing Agreement (DPA) with every AI provider you use. Most major providers offer these, but you need to actually have them in place and ensure your usage stays within the agreed terms.

The AI Act's data governance requirements under Art. 10 reinforce this. You need to document what data goes where, why, and under what controls. The two regulations create overlapping obligations that point in the same direction: know your data, minimize it, protect it, document it.

The practical bottom line

Automated PII detection and redaction isn't optional anymore. It's a compliance requirement from two directions — GDPR's data minimization and the AI Act's data governance. If personal data doesn't need to be in the prompt, it shouldn't be. And you need a systematic way to enforce that, not just good intentions.

For a deeper dive into detection strategies, see our post on PII detection best practices.

Practical compliance checklist for engineering teams

Roughly in order of priority:

1. Audit your AI inventory

Map every service, endpoint, and feature in your stack that uses an LLM or AI model. For each one, document:

  • Which AI provider/model is used
  • What data flows to the model (request payloads)
  • What data comes back (response payloads)
  • Who has access to the integration
  • Whether personal data is involved (it almost certainly is)

You can't protect what you don't know about. Shadow AI — teams spinning up LLM integrations without central visibility — is a real problem. Start by getting a complete picture.
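A lightweight way to keep this inventory is one typed record per integration. A sketch in TypeScript — the field names and example values are illustrative, not a prescribed schema:

```typescript
// One inventory entry per AI integration, mirroring the audit list above.
interface AIIntegration {
  service: string;             // e.g. "support-bot"
  provider: string;            // e.g. "openai"
  model: string;
  requestData: string[];       // categories of data sent to the model
  responseData: string[];      // categories of data returned
  accessGroups: string[];      // who can call this integration
  handlesPersonalData: boolean;
}

// Illustrative entry -- your own inventory replaces this.
const inventory: AIIntegration[] = [
  {
    service: "support-bot",
    provider: "openai",
    model: "gpt-4o",
    requestData: ["ticket text", "customer name"],
    responseData: ["draft reply"],
    accessGroups: ["support-eng"],
    handlesPersonalData: true,
  },
];

// Which integrations need DPA and data-governance review first?
function personalDataIntegrations(items: AIIntegration[]): AIIntegration[] {
  return items.filter((i) => i.handlesPersonalData);
}
```

Even a checked-in JSON file with this shape beats tribal knowledge: it's diffable, reviewable, and becomes the backbone of your documentation package later.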

2. Classify data sensitivity per endpoint

Not all AI integrations carry the same risk. A code completion tool processing internal codebases is different from a customer-facing chatbot processing support conversations. Classify each integration by:

  • Data sensitivity: Does it handle PII? Sensitive categories? Financial data? Health data?
  • Risk level: Under the AI Act, would this be high-risk, limited, or minimal?
  • Exposure: Is the data sent to a third-party provider, or processed locally?
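The three axes above can be encoded directly, which also gives you a cheap way to triage which integrations to remediate first. A sketch — the scoring weights are an arbitrary illustration of the idea, not a compliance formula:

```typescript
// Classification record per integration, following the three axes above.
type RiskTier = "high" | "limited" | "minimal";
type Exposure = "third-party" | "local";

interface IntegrationRisk {
  name: string;
  dataCategories: string[];   // e.g. "PII", "financial", "health"
  riskTier: RiskTier;
  exposure: Exposure;
}

// Illustrative triage: third-party exposure plus sensitive data goes
// to the front of the remediation queue. Weights are arbitrary.
function reviewPriority(r: IntegrationRisk): number {
  let score = 0;
  if (r.exposure === "third-party") score += 1;
  if (r.riskTier === "high") score += 2;
  if (r.dataCategories.includes("health")) score += 2;
  return score;
}
```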

3. Implement automated PII detection

This is the technical core of compliance. You need detection at the network layer that catches personal data before it leaves your infrastructure.

  • Regex patterns for structured PII: emails, phone numbers, credit card numbers, social security numbers, IBANs. These are deterministic, auditable, and fast.
  • AI-powered detection for unstructured PII: person names, locations, organizations in freeform text. Local AI models catch what regex can't.
  • DLP rules for business-sensitive data: source code, API keys, internal identifiers.

Detection should be centralized, not per-service. If every team implements their own PII scrubbing, you get inconsistent coverage and no unified audit trail. A proxy means every AI request gets the same treatment.
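To make the regex layer concrete, here's a minimal sketch of deterministic detection and redaction. The patterns are deliberately simplified for illustration — production rules need stricter validation (IBAN checksums, card-number Luhn checks, locale-specific formats):

```typescript
// Deterministic detectors for structured PII. Patterns are simplified
// for illustration; production rules need stricter validation.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  iban: /\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b/g,
  creditCard: /\b(?:\d[ -]?){13,16}\b/g,
};

function redact(text: string): { redacted: string; categories: string[] } {
  const categories: string[] = [];
  let redacted = text;
  for (const [name, pattern] of Object.entries(PII_PATTERNS)) {
    if (pattern.test(redacted)) {
      categories.push(name);
      pattern.lastIndex = 0; // reset state left by .test() on a /g regex
      redacted = redacted.replace(pattern, `[${name.toUpperCase()}]`);
    }
  }
  return { redacted, categories };
}
```

Note the return shape: the caller gets back the redacted text plus detected *categories*, which is exactly what belongs in an audit log — never the matched values themselves.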

4. Enable audit logging

Every request to an LLM should generate an audit record:

  • Timestamp
  • Source service/endpoint
  • What detection rules were applied
  • What was detected (categories, not raw PII)
  • What action was taken (redacted, masked, blocked, passed)

This is your compliance evidence. When a regulator asks "how do you ensure personal data isn't unnecessarily sent to AI providers?", you point to the logs.
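The five fields above map naturally onto a record type. A sketch — field names are illustrative, and note again that only detection *categories* are stored, never the raw matched values:

```typescript
// Shape of one compliance audit record, per the field list above.
type DetectionAction = "redacted" | "masked" | "blocked" | "passed";

interface AuditRecord {
  timestamp: string;
  source: string;               // originating service/endpoint
  rulesApplied: string[];
  detectedCategories: string[]; // e.g. ["email", "iban"], never raw PII
  action: DetectionAction;
}

// Illustrative policy: redact whenever anything was detected.
function makeAuditRecord(
  source: string,
  rulesApplied: string[],
  detectedCategories: string[]
): AuditRecord {
  return {
    timestamp: new Date().toISOString(),
    source,
    rulesApplied,
    detectedCategories,
    action: detectedCategories.length > 0 ? "redacted" : "passed",
  };
}
```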

5. Add prompt injection detection

Art. 15 requires robustness against adversarial manipulation. Score incoming requests for injection risk and define policies for handling high-risk inputs — block, log, or alert.
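The score-then-policy pattern can be sketched in a few lines. This naive keyword heuristic is only an illustration of the shape — real detection needs model-based classifiers, as simple markers are trivially evaded:

```typescript
// Naive heuristic scorer for injection risk -- illustrates the
// score-then-policy pattern, not a production detector.
const INJECTION_MARKERS = [
  /ignore (all )?(previous|prior) instructions/i,
  /you are now/i,
  /system prompt/i,
  /disregard .* rules/i,
];

type Policy = "block" | "alert" | "pass";

function injectionScore(input: string): number {
  const hits = INJECTION_MARKERS.filter((m) => m.test(input)).length;
  return hits / INJECTION_MARKERS.length;
}

function applyPolicy(score: number): Policy {
  if (score >= 0.5) return "block";
  if (score > 0) return "alert";
  return "pass";
}
```

The useful part is the separation: detection produces a score, and a policy layer decides what to do with it — so you can tune thresholds and actions per endpoint without touching the detector.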

6. Document everything

Compliance is partly a paperwork exercise. Document:

  • Your AI system inventory
  • Risk classifications for each system
  • Data governance practices
  • Detection rules and their rationale
  • Incident response procedures
  • Human oversight mechanisms

This documentation forms your compliance evidence package. Build it as you go, not the week before enforcement starts.

How proxy-level protection fits the compliance picture

A proxy-based approach has specific advantages for compliance.

Centralized audit trail: A single proxy handling all AI traffic gives you one place for compliance evidence. Every request, every detection, every action — logged consistently across all services and providers. No stitching together logs from different teams or services. Grepture's activity log and compliance reports (Business+ plans) are designed specifically for this.

Auditable detection: Regulators don't just want to know that you protect data — they want to know how. Deterministic regex rules are fully explainable: "this pattern matches email addresses, and we redact them before forwarding to the AI provider." AI-powered detection adds coverage for unstructured PII. The combination gives you both breadth and auditability.

Data minimization by design: Grepture's mask-and-restore feature replaces real values with secure tokens before the LLM sees them, then swaps them back in the response. The model never processes the actual personal data. This is data minimization in the strongest sense — the data literally never reaches the processor.
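Conceptually, mask-and-restore is a round trip: swap sensitive values for opaque tokens before the call, then swap them back in the response. A simplified sketch of the idea — not Grepture's actual implementation, which handles detection, token security, and edge cases this toy version ignores:

```typescript
// Conceptual mask-and-restore round trip: replace sensitive values
// with opaque tokens before the model call, restore them afterwards.
function maskValues(text: string, sensitive: string[]) {
  const mapping = new Map<string, string>();
  let masked = text;
  sensitive.forEach((value, i) => {
    const token = `<TOKEN_${i}>`;
    mapping.set(token, value);
    masked = masked.split(value).join(token);
  });
  return { masked, mapping };
}

function restoreValues(text: string, mapping: Map<string, string>): string {
  let restored = text;
  for (const [token, value] of mapping) {
    restored = restored.split(token).join(value);
  }
  return restored;
}
```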

Zero-data mode: For maximum data minimization, Business+ plans offer zero-data mode — requests are processed entirely in memory with no bodies or headers stored. You get protection without creating new data stores to worry about.

EU-hosted infrastructure: Data never leaves the EU. For organizations subject to data residency requirements, this eliminates one more compliance concern. All AI detection models run locally on Grepture infrastructure — your data isn't forwarded to yet another third party for the purpose of protecting it.

Drop-in integration: You can add these controls without rewriting your AI stack. Install the SDK (npm install @grepture/sdk), wrap your existing OpenAI or Anthropic client, and your traffic starts flowing through the proxy. Compliance shouldn't require a six-month engineering project.

Key takeaways

For engineering teams, the EU AI Act translates into a set of technical requirements that directly affect how you build and run AI features. Most of these obligations already existed under GDPR. The AI Act makes them explicit and adds real enforcement.

The bottom line: know what data flows through your AI systems, minimize it, protect it, log it, and document your controls. Automated PII detection, prompt injection prevention, and audit logging aren't just good practices anymore — they're regulatory requirements.

If you want to get started, the quickstart guide will have you up and running in minutes. For a deeper look at detection strategies, check out PII detection best practices.