Zero-Retention AI Processing: How to Use LLMs Without Storing Data

What is zero-retention AI processing

Zero-retention AI processing means sending data through an AI model without writing the request or response content to persistent storage. The data exists in memory during processing and is discarded immediately after. Nothing touches disk.

Only operational metadata is logged: HTTP method, status code, latency, which detection rules fired, and request identifiers. No prompt text, no response bodies, no headers, and no URL paths or query parameters.

This is different from "we delete your data after 30 days." Zero-retention means the data is never stored in the first place. There's nothing to delete, nothing to leak in a breach, and nothing to produce in response to a subpoena.

Why it matters

GDPR data minimization (Article 5(1)(c))

Personal data must be "adequate, relevant and limited to what is necessary." If you don't need to store the content of AI API calls, storing it conflicts with data minimization. Zero-retention is the most direct implementation of this principle: you process the data, use the result, and discard the input.

Storage limitation (Article 5(1)(e))

Personal data must be kept "for no longer than is necessary." Zero-retention sets that duration to zero. There's no retention period to manage, no deletion schedule to maintain, and no risk of data outliving its purpose.

Right to erasure simplification

GDPR Article 17 gives data subjects the right to have their personal data erased. If a user asks you to delete their data, you need to find and delete every copy — including AI API logs, cached prompts, and stored responses. With zero-retention, there's nothing to find and nothing to delete.

Breach risk reduction

You can't leak data you don't have. A security breach at your AI proxy layer can't expose stored prompt content if that content was never written to disk. That narrows what a breach can expose, which reduces the scope of any notification under GDPR Article 33 and simplifies incident response.

Zero-retention vs. standard processing

Aspect	Standard processing	Zero-retention
Request bodies	Stored in logs	Never written to disk
Response bodies	Stored in logs	Never written to disk
Headers	Stored (may contain PII)	Not stored
URLs	Stored (may contain query params)	Not stored
Operational metadata	Stored	Stored (method, status, latency)
Detection rule results	Stored with context	Stored (rule ID + hit count only)
Audit capability	Full replay of requests	Metadata-only audit trail
Debugging	Full request/response inspection	Metadata + reproduce in dev
Breach exposure	All stored content at risk	Only metadata at risk
DSAR response	Must search and delete content	No content to find or delete
Storage costs	Scales with traffic	Near-zero

The tradeoff is clear: you lose the ability to inspect historical request content in production. For most compliance-sensitive workloads, that's a feature, not a bug.

When to use zero-retention

Use it when:

Healthcare (PHI) — Patient data in clinical decision support, medical record summarization, or diagnostic assistance. HIPAA's minimum necessary standard aligns directly with zero-retention.
Financial services (PCI) — Payment card data, transaction details, or account information in fraud detection or customer service automation. PCI DSS data retention requirements are simplified.
Legal (privilege) — Attorney-client privileged communications processed through AI for document review, contract analysis, or case research. Privilege can be waived by disclosure — zero-retention minimizes that risk.
HR and employment — Employee records, salary data, performance reviews, or candidate information used in AI-assisted HR workflows.
Any high-sensitivity workflow — Where the risk of data exposure outweighs the benefit of logging.

When NOT to use it:

Debugging in development — You need full request/response logs to troubleshoot issues. Use standard processing in dev/staging and zero-retention in production.
Compliance audit requiring full logs — Some regulations (e.g., certain financial services requirements) mandate retaining full transaction records. Zero-retention doesn't satisfy these. Check with your compliance team.
Model evaluation and monitoring — If you need to evaluate AI output quality over time, you need stored responses. Consider running evaluation on a separate, controlled pipeline with appropriate data handling.
Incident forensics — After a production incident, you may need request logs to understand what happened. With zero-retention, you'll only have metadata. Plan your debugging strategy accordingly.
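The dev/staging versus production split above can be written down as a small lookup that a deployment checklist or CI check consults. This is only a sketch: Grepture's toggle is set per project in the dashboard, and the environment names, mode labels, and `retention_mode` helper here are hypothetical.

```python
# Hypothetical mapping of environment name to the retention mode its project
# is expected to use. Names and labels are illustrative, not a Grepture API.
RETENTION_BY_ENV = {
    "development": "standard",
    "staging": "standard",
    "production": "zero-retention",
}


def retention_mode(env: str) -> str:
    """Return the expected mode for an environment.

    Unknown environments fall back to the strictest mode.
    """
    return RETENTION_BY_ENV.get(env.lower(), "zero-retention")
```

Defaulting unknown environments to zero-retention means a misnamed or newly added environment fails safe rather than silently logging content.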

How Grepture implements zero-retention

In Grepture, zero-retention is a single toggle in the dashboard. No code changes, no SDK updates, no configuration files.

When enabled:

Detection rules still fire — PII detection, secret scanning, and redaction rules run on every request, exactly as they would in standard mode. Your security posture doesn't change.
Mask-and-restore still works — If you're using reversible redaction, tokens are generated, PII is masked in the outbound request, and original values are restored in the response. The token-to-value mappings exist only in memory and are discarded after the response is delivered.
Only metadata is logged — HTTP method, status code, response time, rule hit counts, and request identifiers. Enough to monitor system health and detect anomalies. Not enough to reconstruct any request content.
No request/response bodies on disk — Prompt text, completion text, headers, and URLs are never written to persistent storage. They exist in memory for the duration of the request and are garbage-collected after.

Dashboard → Project Settings → Data Retention → Zero-Retention Mode: ON

That's it. Every request through that project's proxy endpoint is now zero-retention.

Architecture: how in-memory processing works

Zero-retention isn't just "don't log things." It's a deliberate architecture that ensures data never reaches persistent storage:

Request flow:

Your application sends a request to the Grepture proxy
The request is received into an in-memory buffer (never written to a disk-backed queue)
Detection rules run against the in-memory content
If mask-and-restore is enabled, a token map is generated in memory
The sanitized request is forwarded to the AI provider over TLS 1.3
The response is received into an in-memory buffer
Token restoration runs against the in-memory response
The restored response is streamed back to your application
All in-memory buffers are zeroed and released
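The ordering of the request flow above can be sketched in a few lines. This is an illustration under stated assumptions, not Grepture's implementation: `handle_request`, `detect`, and `forward` are hypothetical names, mask-and-restore is left out for brevity, and zeroing in Python is best-effort because the runtime may hold other copies of the data.

```python
from typing import Callable, Dict, List, Tuple


def zero(buf: bytearray) -> None:
    """Overwrite a buffer in place with zero bytes."""
    buf[:] = bytes(len(buf))


def handle_request(
    body: bytes,
    detect: Callable[[bytearray], List[str]],   # hypothetical rule engine
    forward: Callable[[bytearray], bytes],      # hypothetical upstream call
) -> Tuple[bytes, Dict[str, object]]:
    # The request lives only in a mutable in-memory buffer.
    request_buf = bytearray(body)
    response_buf = bytearray()
    try:
        # Detection runs against the in-memory content.
        rules_fired = detect(request_buf)
        # The provider's response is also received into memory only.
        response_buf = bytearray(forward(request_buf))
        # Only rule names leave this function as loggable metadata.
        metadata: Dict[str, object] = {"rules_fired": rules_fired}
        return bytes(response_buf), metadata
    finally:
        # Runs on success and on error: zero both buffers before release.
        zero(request_buf)
        zero(response_buf)
```

Putting the zeroing in a `finally` block means the buffers are cleared even when detection or the upstream call raises.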

What's stored:

Operational metadata: { method: "POST", status: 200, latency_ms: 1243, rules_fired: ["email", "phone", "iban"], request_id: "req_abc123" }
That's it.
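One way to keep a log record metadata-only is to give it a fixed schema with no field that could hold content. The sketch below mirrors the fields in the example record above; the `RequestMetadata` class and `to_log_line` helper are hypothetical, not part of Grepture.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class RequestMetadata:
    """Fixed set of loggable fields; there is no slot for bodies or headers."""
    method: str
    status: int
    latency_ms: int
    rules_fired: Tuple[str, ...]
    request_id: str


def to_log_line(meta: RequestMetadata) -> str:
    """Serialize the record as one JSON line for the metadata log."""
    return json.dumps(asdict(meta), sort_keys=True)
```

Because the dataclass rejects unknown keyword arguments, code that tries to attach a `body` or `headers` field fails with a `TypeError` instead of quietly writing content to the log.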

What's NOT stored:

Request bodies (prompts, messages, system instructions)
Response bodies (completions, generated text)
HTTP headers (including authorization headers)
URL paths or query parameters
Token-to-value mappings from mask-and-restore
Any intermediate processing state

Ephemeral token storage: When mask-and-restore is active, the token map (e.g., [PERSON_1] → "Maria Schmidt") exists only in memory for the duration of the request. For streaming responses, the map is held until the stream completes, then immediately discarded. There is no disk-backed fallback.
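A minimal sketch of mask-and-restore with a token map that lives only in memory might look like the following. It is an assumption-laden illustration: it uses a simple email regex as a stand-in for real detection rules (detecting names such as the `[PERSON_1]` example would need more than a regex), and `mask`, `restore`, and the `[EMAIL_n]` token format are hypothetical.

```python
import re
from typing import Dict, Tuple

# Simplified email pattern, used here only as a stand-in detection rule.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each email with a token and return the in-memory token map."""
    token_map: Dict[str, str] = {}

    def replace(match: "re.Match[str]") -> str:
        value = match.group(0)
        # Reuse the token if the same value was already seen.
        for token, original in token_map.items():
            if original == value:
                return token
        token = f"[EMAIL_{len(token_map) + 1}]"
        token_map[token] = value
        return token

    return EMAIL.sub(replace, text), token_map


def restore(text: str, token_map: Dict[str, str]) -> str:
    """Put the original values back into the provider's response."""
    for token, original in token_map.items():
        text = text.replace(token, original)
    return text
```

In use, the caller masks the outbound prompt, restores the response, and then drops the map, for example with `token_map.clear()`, so that nothing outlives the request. For a streamed response the map would have to be held until the last chunk has been restored.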

Infrastructure: Grepture's proxy runs in Frankfurt (eu-central-1). Even in standard processing mode, data stays in the EU. In zero-retention mode, data doesn't even stay in the proxy — it passes through and is gone.

Compliance mapping

GDPR

Requirement	How zero-retention helps
Data minimization (Art. 5(1)(c))	No unnecessary data stored — processing only
Storage limitation (Art. 5(1)(e))	Retention period is zero
Right to erasure (Art. 17)	No stored data to erase
Data protection by design (Art. 25)	Zero-retention is a technical measure implementing minimization by default
Breach notification (Art. 33)	Reduced scope — metadata-only breach has lower impact
Records of processing (Art. 30)	Metadata audit trail supports record-keeping; the Art. 30 record itself still has to describe the processing activity

EU AI Act

Requirement	How zero-retention helps
Data governance (Art. 10)	Demonstrates controlled, minimized data handling in AI systems
Record-keeping (Art. 12)	Metadata logs provide operational records without storing prompt or response content
Transparency (Art. 13)	Audit trail shows what rules fired and when, without exposing data

For a deeper look at EU AI Act obligations, see the EU AI Act compliance guide for engineers.

HIPAA

Requirement	How zero-retention helps
Minimum necessary (§164.502(b))	Only metadata retained — PHI is not stored
Access controls (§164.312(a))	No stored PHI to control access to
Audit controls (§164.312(b))	Metadata logs provide audit capability without PHI exposure

PCI DSS

Requirement	How zero-retention helps
Data retention (Req. 3.1)	Cardholder data is not retained
Rendering data unreadable (Req. 3.4)	Not applicable — data is never stored
Logging and monitoring (Req. 10)	Metadata logs support monitoring without recording cardholder data

Getting started

1. Enable zero-retention in the dashboard

Navigate to your project settings and toggle zero-retention mode. This applies to all requests through that project's proxy endpoint.

2. Verify with the audit log

After enabling, make a few test requests and check the audit log. You should see metadata entries (method, status, latency, rules fired) but no request or response content. If you see content, zero-retention isn't active — check your project settings.
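The manual check above can also be automated. The sketch below assumes the audit log can be exported as a list of JSON-like objects using the field names from the metadata example earlier in this article; the export format and the `unexpected_fields` helper are hypothetical.

```python
from typing import Dict, List, Tuple

# Fields expected in a metadata-only entry (taken from the example record).
ALLOWED_KEYS = {"method", "status", "latency_ms", "rules_fired", "request_id"}


def unexpected_fields(entries: List[Dict[str, object]]) -> List[Tuple[int, str]]:
    """Return (entry index, field name) for every field outside the allowlist."""
    problems: List[Tuple[int, str]] = []
    for index, entry in enumerate(entries):
        for key in sorted(entry.keys() - ALLOWED_KEYS):
            problems.append((index, key))
    return problems
```

An empty result means every entry is metadata-only. Any returned pair, such as a `body` or `headers` field, points at an entry to investigate before relying on zero-retention.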

3. Combine with PII redaction

Zero-retention and PII redaction are complementary. Even with zero-retention enabled, you should still run detection rules to catch and redact PII before it reaches the AI provider. Zero-retention prevents your proxy from storing the data. Redaction prevents the AI provider from seeing it. Use both.

For a complete guide on making your AI API calls GDPR-compliant — including data processing agreements, lawful basis, and cross-border transfers — see How to Make AI API Calls GDPR-Compliant.

4. Confirm with your compliance team

Share this with your DPO or compliance officer. Zero-retention is a strong technical measure, but it needs to fit into your broader data protection strategy. Your DPIA should reference it, your processing records should document it, and your privacy policy should reflect it.
