Zero-Retention AI Processing: How to Use LLMs Without Storing Data
Process data through AI models without writing request content to disk. Learn how zero-retention AI processing works, when to use it, and how it maps to GDPR, HIPAA, and PCI DSS requirements.
What is zero-retention AI processing
Zero-retention AI processing means sending data through an AI model without writing the request or response content to persistent storage. The data exists in memory during processing and is discarded immediately after. Nothing touches disk.
Only operational metadata is logged — HTTP method, status code, latency, which detection rules fired, and request identifiers. No prompt text, no response bodies, no headers, no URLs containing query parameters.
This is different from "we delete your data after 30 days." Zero-retention means the data is never stored in the first place. There's nothing to delete, nothing to leak in a breach, and nothing to produce in response to a subpoena.
Why it matters
GDPR data minimization (Article 5(1)(c))
Personal data must be "adequate, relevant and limited to what is necessary." If you don't need to store the content of AI API calls, storing it conflicts with data minimization. Zero-retention is the most direct implementation of this principle: you process the data, use the result, and discard the input.
Storage limitation (Article 5(1)(e))
Personal data must be kept "for no longer than is necessary." Zero-retention sets that duration to zero. There's no retention period to manage, no deletion schedule to maintain, and no risk of data outliving its purpose.
Right to erasure simplification
GDPR Article 17 gives data subjects the right to have their personal data erased. If a user asks you to delete their data, you need to find and delete every copy — including AI API logs, cached prompts, and stored responses. With zero-retention, there's nothing to find and nothing to delete.
Breach risk reduction
You can't leak data you don't have. A security breach at your AI proxy layer can't expose prompt content if that content was never written to disk. The only data at risk is the operational metadata, which narrows what you have to assess and report under GDPR Article 33 and simplifies incident response.
Zero-retention vs. standard processing
| Aspect | Standard processing | Zero-retention |
|---|---|---|
| Request bodies | Stored in logs | Never written to disk |
| Response bodies | Stored in logs | Never written to disk |
| Headers | Stored (may contain PII) | Not stored |
| URLs | Stored (may contain query params) | Not stored |
| Operational metadata | Stored | Stored (method, status, latency) |
| Detection rule results | Stored with context | Stored (rule ID + hit count only) |
| Audit capability | Full replay of requests | Metadata-only audit trail |
| Debugging | Full request/response inspection | Metadata + reproduce in dev |
| Breach exposure | All stored content at risk | Only metadata at risk |
| DSAR response | Must search and delete content | No content to find or delete |
| Storage costs | Scales with traffic | Near-zero |
The tradeoff is clear: you lose the ability to inspect historical request content in production. For most compliance-sensitive workloads, that's a feature, not a bug.
When to use zero-retention
Use it when:
- Healthcare (PHI) — Patient data in clinical decision support, medical record summarization, or diagnostic assistance. HIPAA's minimum necessary standard aligns directly with zero-retention.
- Financial services (PCI) — Payment card data, transaction details, or account information in fraud detection or customer service automation. PCI DSS data retention requirements are simplified.
- Legal (privilege) — Attorney-client privileged communications processed through AI for document review, contract analysis, or case research. Privilege can be waived by disclosure — zero-retention minimizes that risk.
- HR and employment — Employee records, salary data, performance reviews, or candidate information used in AI-assisted HR workflows.
- Any high-sensitivity workflow — Where the risk of data exposure outweighs the benefit of logging.
When NOT to use it:
- Debugging in development — You need full request/response logs to troubleshoot issues. Use standard processing in dev/staging and zero-retention in production.
- Compliance audit requiring full logs — Some regulations (e.g., certain financial services requirements) mandate retaining full transaction records. Zero-retention doesn't satisfy these. Check with your compliance team.
- Model evaluation and monitoring — If you need to evaluate AI output quality over time, you need stored responses. Consider running evaluation on a separate, controlled pipeline with appropriate data handling.
- Incident forensics — After a production incident, you may need request logs to understand what happened. With zero-retention, you'll only have metadata. Plan your debugging strategy accordingly.
How Grepture implements zero-retention
In Grepture, zero-retention is a single toggle in the dashboard. No code changes, no SDK updates, no configuration files.
When enabled:
- Detection rules still fire — PII detection, secret scanning, and redaction rules run on every request, exactly as they would in standard mode. Your security posture doesn't change.
- Mask-and-restore still works — If you're using reversible redaction, tokens are generated, PII is masked in the outbound request, and original values are restored in the response. The token-to-value mappings exist only in memory and are discarded after the response is delivered.
- Only metadata is logged — HTTP method, status code, response time, rule hit counts, and request identifiers. Enough to monitor system health and detect anomalies. Not enough to reconstruct any request content.
- No request/response bodies on disk — Prompt text, completion text, headers, and URLs are never written to persistent storage. They exist in memory for the duration of the request and are garbage-collected after.
Dashboard → Project Settings → Data Retention → Zero-Retention Mode: ON
That's it. Every request through that project's proxy endpoint is now zero-retention.
Architecture: how in-memory processing works
Zero-retention isn't just "don't log things." It's a deliberate architecture that ensures data never reaches persistent storage:
Request flow:
1. Your application sends a request to the Grepture proxy
2. The request is received into an in-memory buffer (never written to a disk-backed queue)
3. Detection rules run against the in-memory content
4. If mask-and-restore is enabled, a token map is generated in memory
5. The sanitized request is forwarded to the AI provider over TLS 1.3
6. The response is received into an in-memory buffer
7. Token restoration runs against the in-memory response
8. The restored response is streamed back to your application
9. All in-memory buffers are zeroed and released
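The request flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Grepture's implementation: the `mask`, `restore`, and `handle_request` names, the single email rule, and the `forward` callback standing in for the AI provider call are all hypothetical. Note that Python strings are immutable, so only the request `bytearray` can actually be overwritten here; a production proxy would need tighter control over memory than this sketch has.

```python
import re
import time
import uuid
from typing import Callable

# Hypothetical single detection rule; a real deployment would run many.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Swap each distinct email for a token and return the in-memory token map."""
    token_map: dict[str, str] = {}  # token -> original value
    by_value: dict[str, str] = {}   # original value -> token

    def _replace(match: re.Match) -> str:
        value = match.group(0)
        if value not in by_value:
            token = f"[EMAIL_{len(by_value) + 1}]"
            by_value[value] = token
            token_map[token] = value
        return by_value[value]

    return EMAIL_RE.sub(_replace, text), token_map


def restore(text: str, token_map: dict[str, str]) -> str:
    """Put the original values back into the provider's response."""
    for token, value in token_map.items():
        text = text.replace(token, value)
    return text


def handle_request(body: bytearray, forward: Callable[[str], str]) -> tuple[str, dict]:
    """Process one request purely in memory and return (response, metadata)."""
    started = time.monotonic()
    masked, token_map = mask(body.decode("utf-8"))
    rules_fired = ["email"] if token_map else []
    try:
        reply = forward(masked)            # the provider only sees masked text
        restored = restore(reply, token_map)
    finally:
        token_map.clear()                  # drop the token-to-value mappings
        body[:] = bytes(len(body))         # overwrite the request buffer with zeros
    metadata = {
        "method": "POST",
        "status": 200,                     # stand-in; a real proxy records the upstream status
        "latency_ms": round((time.monotonic() - started) * 1000),
        "rules_fired": rules_fired,
        "request_id": f"req_{uuid.uuid4().hex[:8]}",
    }
    return restored, metadata


# Usage with a fake provider that echoes the token back.
buffer = bytearray(b"Contact maria@example.com please")
response, meta = handle_request(buffer, lambda prompt: "Reply sent to [EMAIL_1].")
```

Cleanup sits in a `finally` block so the token map and request buffer are cleared even when the upstream call raises.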
What's stored:
- Operational metadata: { method: "POST", status: 200, latency_ms: 1243, rules_fired: ["email", "phone", "iban"], request_id: "req_abc123" }
- That's it.
What's NOT stored:
- Request bodies (prompts, messages, system instructions)
- Response bodies (completions, generated text)
- HTTP headers (including authorization headers)
- URL paths or query parameters
- Token-to-value mappings from mask-and-restore
- Any intermediate processing state
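One way to enforce the stored/not-stored split is an allowlist applied at the logging boundary. The sketch below is illustrative only: `ALLOWED_FIELDS` mirrors the metadata fields listed above, while the `to_log_record` function and the shape of the `event` dictionary are assumptions rather than Grepture's actual internals.

```python
# Fields that may reach persistent storage; everything else is dropped.
ALLOWED_FIELDS = ("method", "status", "latency_ms", "rules_fired", "request_id")


def to_log_record(event: dict) -> dict:
    """Copy only allowlisted operational fields out of a request event."""
    return {key: event[key] for key in ALLOWED_FIELDS if key in event}


# Hypothetical in-memory event that still carries sensitive content.
event = {
    "method": "POST",
    "status": 200,
    "latency_ms": 1243,
    "rules_fired": ["email", "phone", "iban"],
    "request_id": "req_abc123",
    "url": "/v1/chat?user=42",
    "headers": {"authorization": "Bearer secret"},
    "body": "Patient Maria Schmidt reports...",
}

record = to_log_record(event)  # contains the five metadata fields only
```

An allowlist fails closed: a field added to the event later stays out of the logs until someone deliberately allows it, whereas a denylist would let it through by default.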
Ephemeral token storage: When mask-and-restore is active, the token map (e.g., [PERSON_1] → "Maria Schmidt") exists only in memory for the duration of the request. For streaming responses, the map is held until the stream completes, then immediately discarded. There is no disk-backed fallback.
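For streaming responses, a token such as [PERSON_1] can arrive split across two chunks, which is one reason the map has to live until the stream ends. The generator below is a rough sketch of that behavior under stated assumptions: `restore_stream` is a hypothetical name, tokens are assumed to be bracket-delimited, and original values are assumed not to contain brackets themselves.

```python
from typing import Iterable, Iterator


def restore_stream(chunks: Iterable[str], token_map: dict[str, str]) -> Iterator[str]:
    """Restore tokens in a chunked response, then discard the token map."""
    pending = ""
    max_len = max((len(token) for token in token_map), default=0)
    try:
        for chunk in chunks:
            pending += chunk
            for token, value in token_map.items():
                pending = pending.replace(token, value)
            # Hold back a trailing "[..." fragment that could still grow into a token.
            cut = pending.rfind("[")
            if cut != -1 and "]" not in pending[cut:] and len(pending) - cut < max_len:
                ready, pending = pending[:cut], pending[cut:]
            else:
                ready, pending = pending, ""
            if ready:
                yield ready
        if pending:
            yield pending  # leftover fragment that never became a token
    finally:
        token_map.clear()  # the map is dropped once the stream finishes or is closed


# Usage: the token is split across the first two chunks.
token_map = {"[PERSON_1]": "Maria Schmidt"}
text = "".join(restore_stream(["Hello [PER", "SON_1], welcome", "."], token_map))
```

Because the cleanup sits in `finally`, the map is also cleared if the consumer closes the generator early, for example when the client disconnects mid-stream.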
Infrastructure: Grepture's proxy runs in Frankfurt (eu-central-1). Even in standard processing mode, data stays in the EU. In zero-retention mode, data doesn't even stay in the proxy — it passes through and is gone.
Compliance mapping
GDPR
| Requirement | How zero-retention helps |
|---|---|
| Data minimization (Art. 5(1)(c)) | No unnecessary data stored — processing only |
| Storage limitation (Art. 5(1)(e)) | Retention period is zero |
| Right to erasure (Art. 17) | No stored data to erase |
| Data protection by design (Art. 25) | Zero-retention is a technical measure implementing minimization by default |
| Breach notification (Art. 33) | Reduced scope — metadata-only breach has lower impact |
| Records of processing (Art. 30) | Metadata audit trail supports your record of processing activities, which you still document separately (purposes, data categories, recipients) |
EU AI Act
| Requirement | How zero-retention helps |
|---|---|
| Data governance (Art. 10) | Demonstrates controlled, minimized data handling in AI systems |
| Record-keeping (Art. 12) | Metadata logs provide required operational records without storing personal data |
| Transparency (Art. 13) | Audit trail shows what rules fired and when, without exposing data |
For a deeper look at EU AI Act obligations, see the EU AI Act compliance guide for engineers.
HIPAA
| Requirement | How zero-retention helps |
|---|---|
| Minimum necessary (§164.502(b)) | Only metadata retained — PHI is not stored |
| Access controls (§164.312(a)) | No stored PHI to control access to |
| Audit controls (§164.312(b)) | Metadata logs provide audit capability without PHI exposure |
PCI DSS
| Requirement | How zero-retention helps |
|---|---|
| Data retention (Req. 3.1) | Cardholder data is not retained |
| Rendering data unreadable (Req. 3.4) | Not applicable — data is never stored |
| Logging and monitoring (Req. 10) | Metadata logs support monitoring requirements without storing cardholder data |
Getting started
1. Enable zero-retention in the dashboard
Navigate to your project settings and toggle zero-retention mode. This applies to all requests through that project's proxy endpoint.
2. Verify with the audit log
After enabling, make a few test requests and check the audit log. You should see metadata entries (method, status, latency, rules fired) but no request or response content. If you see content, zero-retention isn't active — check your project settings.
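If you can export audit log entries as JSON, the check can be automated. The sketch below assumes a hypothetical export format (a JSON array of objects using the metadata field names shown earlier); the actual export shape and field names may differ, so treat `EXPECTED_KEYS` and `unexpected_fields` as placeholders to adapt.

```python
import json

# Assumed metadata-only field set; adjust to match the real audit log schema.
EXPECTED_KEYS = {"method", "status", "latency_ms", "rules_fired", "request_id"}


def unexpected_fields(entries: list[dict]) -> dict[str, set[str]]:
    """Map request_id to any fields that fall outside the metadata-only set."""
    findings: dict[str, set[str]] = {}
    for entry in entries:
        extra = set(entry) - EXPECTED_KEYS
        if extra:
            findings[entry.get("request_id", "<unknown>")] = extra
    return findings


# Hypothetical export: the second entry still carries request content.
exported = json.loads("""
[
  {"method": "POST", "status": 200, "latency_ms": 1243,
   "rules_fired": ["email"], "request_id": "req_abc123"},
  {"method": "POST", "status": 200, "latency_ms": 980,
   "rules_fired": [], "request_id": "req_def456", "body": "raw prompt text"}
]
""")

findings = unexpected_fields(exported)  # {"req_def456": {"body"}}
```

An empty result suggests the sampled entries are metadata-only; any finding means content is still being logged for that request.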
3. Combine with PII redaction
Zero-retention and PII redaction are complementary. Even with zero-retention enabled, you should still run detection rules to catch and redact PII before it reaches the AI provider. Zero-retention prevents your proxy from storing the data. Redaction prevents the AI provider from seeing it. Use both.
For a complete guide on making your AI API calls GDPR-compliant — including data processing agreements, lawful basis, and cross-border transfers — see How to Make AI API Calls GDPR-Compliant.
4. Confirm with your compliance team
Share this with your DPO or compliance officer. Zero-retention is a strong technical measure, but it needs to fit into your broader data protection strategy. Your DPIA should reference it, your processing records should document it, and your privacy policy should reflect it.
Further reading
- How to Make AI API Calls GDPR-Compliant — Lawful basis, data minimization, and practical compliance steps
- EU AI Act compliance guide for engineers — What engineering teams need to do before August 2026
- How Grepture works — Architecture, detection rules, and processing modes
- Grepture docs — SDK reference and configuration guide