How We Redact PII Before It Reaches the LLM

Redaction is only as good as detection

Masking PII in an LLM prompt is the easy part. Once you know a span of text is a name or a card number, replacing it is trivial. The hard part — the part that decides whether you actually have data protection or just the appearance of it — is finding the PII in the first place. Miss one address and it lands in your provider's logs all the same.

So this post is about detection: how Grepture decides what counts as PII before a request reaches OpenAI, Anthropic, or any other provider. There's no single magic model behind it. It's a layered stack — deterministic validators for the things that have structure, NER models for the things that don't — and, importantly, all of it runs in-process at the proxy rather than being farmed out to some third-party detection API. Here's how it fits together.

Layer one: deterministic detection for structured PII

A lot of PII has structure you can verify, not just guess at. Emails have a grammar. Credit-card numbers have a Luhn checksum. IBANs have a mod-97 check. SSNs, phone numbers, IP addresses, dates of birth — all have predictable shapes.

For these, an ML model is the wrong tool. You don't want a probabilistic classifier assigning 87% confidence to something a pattern and a checksum can confirm outright. So the first layer is deterministic: a set of pattern matchers covering the structured categories (email, phone, SSN, credit card, IP address, postal address, date of birth), backed by checksum validation where the format defines one. The result is binary: a value either matches or it doesn't.
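To make the checksum idea concrete, here is a minimal sketch in Python of a Luhn check for card numbers and a mod-97 check for IBANs. It is an illustration under stated assumptions, not Grepture's code: the function names, the candidate regex, and the choice of Python are all ours.

```python
import re

# Assumed candidate pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def iban_ok(iban: str) -> bool:
    """Return True if an IBAN passes the mod-97 check (country lengths not verified)."""
    s = iban.replace(" ", "").upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", s):
        return False
    rearranged = s[4:] + s[:4]  # move country code and check digits to the end
    numeric = "".join(str(int(c, 36)) for c in rearranged)  # A=10 ... Z=35
    return int(numeric) % 97 == 1


def find_cards(text: str):
    """Return (start, end, category) for card-shaped runs that also pass Luhn."""
    hits = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):
            hits.append((m.start(), m.end(), "credit_card"))
    return hits
```

The pattern only nominates candidates; the checksum is what turns a digit run into a confirmed card number, which is why a random 16-digit order ID usually does not get redacted.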

This layer is also what runs in the open-source core of the proxy, with no model downloads required. When matches overlap, they're sorted by position and de-duplicated with a simple, predictable rule — earliest start wins, and on a tie the longer span wins — so you never get half-redacted values or nested replacements corrupting the payload.
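The overlap rule described above can be sketched in a few lines of Python. This is an assumed shape, not the proxy's actual code: matches are taken to be (start, end, category) tuples, and the `[CATEGORY]` placeholder format is hypothetical.

```python
def resolve_overlaps(matches):
    """Keep non-overlapping matches: earliest start wins, longer span wins on a tie."""
    ordered = sorted(matches, key=lambda m: (m[0], -(m[1] - m[0])))
    kept = []
    last_end = -1
    for start, end, category in ordered:
        if start >= last_end:  # skip anything that overlaps a span already kept
            kept.append((start, end, category))
            last_end = end
    return kept


def apply_redactions(text, matches):
    """Replace each kept span with a placeholder, working right to left."""
    out = text
    for start, end, category in reversed(resolve_overlaps(matches)):
        out = out[:start] + f"[{category.upper()}]" + out[end:]
    return out
```

Replacing from the end of the string backwards keeps the earlier offsets valid, so no span needs to be recomputed after a substitution changes the text length.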

The replacement itself walks only the string values of the parsed JSON body, never the structural tokens. That detail matters more than it sounds: redacting naively across a raw JSON string can clobber keys, numbers, and punctuation and produce a body the provider rejects. Detecting and replacing only inside decoded string leaves keeps the document valid.
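A minimal sketch of that traversal, assuming a JSON request body and using a single email pattern as a stand-in for the full detector. The function names and the `[EMAIL]` placeholder are illustrative, not taken from the proxy.

```python
import json
import re

# Stand-in detector for the example: one simple email pattern.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_string(value: str) -> str:
    return EMAIL_RE.sub("[EMAIL]", value)


def redact_leaves(node):
    """Recursively redact string values; keys, numbers, booleans and null pass through."""
    if isinstance(node, str):
        return redact_string(node)
    if isinstance(node, list):
        return [redact_leaves(item) for item in node]
    if isinstance(node, dict):
        return {key: redact_leaves(value) for key, value in node.items()}
    return node


def redact_body(raw: str) -> str:
    """Parse the body, redact its string leaves, and serialize it again."""
    return json.dumps(redact_leaves(json.loads(raw)))
```

Because the body is parsed first and serialized afterwards, quoting and escaping are handled by the JSON library rather than by the redaction code.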

Where regex stops: names

Structured patterns cover a lot, but they fall apart on the single most common category of PII: names. There's no regex for "this token is a person's name." We tried — and the design note we left in the code is blunt about it: regex-based name detection is too unreliable to ship. A pattern that catches "John Smith" also catches "New York," "Monday Morning," and half the capitalized words in any document. Precision collapses.
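The failure is easy to reproduce. The pattern below is a deliberately naive "two capitalized words" rule written for this example, not the pattern we tried:

```python
import re

# Naive rule: two adjacent capitalized words are assumed to be a person's name.
NAIVE_NAME_RE = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

text = "John Smith flew to New York on a Monday Morning flight."
hits = NAIVE_NAME_RE.findall(text)
# hits == ["John Smith", "New York", "Monday Morning"]
```

One of the three hits is a name. Tightening the pattern does not help, because nothing in the shape of the tokens separates a person from a city or a weekday.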

Names, organizations, and locations are unstructured PII. They're defined by context, not shape. That's exactly the problem statistical models were built for — and it's why the second layer exists.

Layer two: NER models for the unstructured kind

For the unstructured categories, Grepture's pro detection runs transformer-based Named Entity Recognition models. They classify text token by token, tagging each one with labels like B-PER (beginning of a person), I-LOC (inside a location), or O (not an entity). We map those entity labels onto our PII categories — person, location, organization — and turn the runs of tagged tokens into clean spans.
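As a rough sketch of that last step, the function below groups word-level BIO tags into entity spans and maps the labels to categories. The label-to-category table and the function name are assumptions made for illustration; the real detector also has to handle subword pieces and character offsets.

```python
# Assumed mapping from NER labels to PII categories.
LABEL_TO_CATEGORY = {"PER": "person", "LOC": "location", "ORG": "organization"}


def group_bio(tokens, tags):
    """Turn parallel lists of words and BIO tags into (category, text) spans."""
    spans = []
    current = None  # (category, [words]) for the entity being built
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        category = LABEL_TO_CATEGORY.get(label)
        if prefix == "I" and current and current[0] == category:
            current[1].append(token)  # continue the open entity
            continue
        if current:  # any other tag closes the open entity
            spans.append((current[0], " ".join(current[1])))
            current = None
        if prefix in ("B", "I") and category:
            current = (category, [token])  # start a new entity
    if current:
        spans.append((current[0], " ".join(current[1])))
    return spans
```

For the tokens Michael, Hartmann, lives, in, New, York tagged B-PER, I-PER, O, O, B-LOC, I-LOC, this yields one person span and one location span rather than four separate fragments.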

Two implementation details do most of the work here:

Subword reassembly. Transformer tokenizers split words into subword pieces — Hart + ##mann — and tag each piece. The detector stitches these back together (Hartmann), and joins consecutive entity tokens into whole names (Michael Hartmann) rather than emitting fragments.
Span resolution. Once an entity is identified, it has to map back to exact character offsets in the original text so the masker replaces precisely the right substring. Where the tokenizer provides offsets we use them directly; where it doesn't, we resolve the position by locating the entity text in the source, advancing a cursor so repeated names don't collide.
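Both details can be sketched briefly. The code assumes WordPiece-style pieces where a leading ## marks a continuation, and it shows only the fallback path of span resolution (searching the source text with a cursor). The function names are hypothetical.

```python
def merge_subwords(pieces):
    """Stitch WordPiece-style pieces back into whole words."""
    words = []
    for piece in pieces:
        if piece.startswith("##") and words:
            words[-1] += piece[2:]  # continuation: append to the previous word
        else:
            words.append(piece)
    return words


def locate_entities(text, entities):
    """Map entity strings, given in reading order, to (start, end) character offsets.

    The cursor moves past each match so a repeated name resolves to its next
    occurrence instead of matching the first one again.
    """
    spans = []
    cursor = 0
    for entity in entities:
        start = text.find(entity, cursor)
        if start == -1:
            continue  # entity text not found verbatim; skipped in this sketch
        end = start + len(entity)
        spans.append((start, end))
        cursor = end
    return spans
```

Without the cursor, both mentions of a repeated name would resolve to the same offsets and the second occurrence would be left unredacted.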

Our NER models cover multiple languages, which is not a nice-to-have for a European product — it's the requirement. A detector tuned only for English loses recall the moment a prompt is in German, and an EU gateway sees German, French, and Dutch as a matter of course. Multilingual coverage is what lets the same pipeline catch a name in a German support ticket as reliably as an English one.

We deliberately don't publish the exact model lineup or the quantization and threshold settings behind them. Those are tuned against real traffic and they're where a lot of the accuracy lives. The architecture, though, is the point: structured PII handled deterministically, unstructured PII handled by a model built for it.

More than PII: a model per job

The NER models aren't alone. Grepture's detection layer loads a small suite of specialized models, each doing one job well rather than one model trying to do everything:

NER models for personal entities (names, locations, organizations).
Prompt-injection detection — a classifier that flags attempts to hijack the model's instructions.
Toxicity detection — a classifier for abusive or harmful content.
Zero-shot classification — a flexible classifier for routing and content categorization where labels are defined at request time.

Each is a separate, purpose-built model for a separate problem, loaded once and reused across requests. PII redaction uses the deterministic layer plus the NER models; the others sit alongside for the security checks that aren't about personal data at all.
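The "loaded once and reused" part is a plain caching pattern. The sketch below uses a stub in place of a real model, since the loading code and model names are not published; `StubModel`, `get_model`, and `LOAD_COUNTS` exist only for this example.

```python
from functools import lru_cache

LOAD_COUNTS = {}  # records how many times each task's model was actually loaded


class StubModel:
    """Placeholder for a real model object; holds only the task name."""

    def __init__(self, task: str):
        self.task = task


@lru_cache(maxsize=None)
def get_model(task: str) -> StubModel:
    """Load the model for a task on first use and return the same instance afterwards."""
    LOAD_COUNTS[task] = LOAD_COUNTS.get(task, 0) + 1
    return StubModel(task)
```

Each request calls `get_model("ner")` or `get_model("toxicity")` and gets the already-loaded instance, so the expensive load happens once per process rather than once per request.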

Why in-process matters

Here's the part that ties the engineering back to compliance. Every one of these models runs inside the proxy process — loaded with quantized ONNX weights and executed locally, not called over the network to a hosted detection service.

Where detection runs determines whether redaction protects anything. If your "PII detection" step is an API call to a third-party vendor, you've handed that vendor the raw, unredacted text before it gets cleaned. You haven't shrunk your data-protection surface; you've added a processor to it. Under GDPR, that's another subprocessor to contract, document, and justify in your records of processing.

Running detection in-process avoids that entirely — and it holds true in our cloud as much as in a self-hosted deployment:

Fewer subprocessors. No prompt text is sent to a separate detection vendor. The text the detection models see never leaves the proxy that's already handling the request.
Detection before egress (GDPR Art. 32). Redaction happens before the request is forwarded upstream, which is the only ordering that keeps raw PII out of the provider call. With post-hoc redaction, the data has already been exposed by the time it is cleaned.
A cleaner lineage story. With most of the EU AI Act's obligations applying from August 2, 2026, "detection runs in the gateway, no data fan-out" is a far simpler thing to put in front of an auditor than a map of detection subprocessors.

If your requirements go further — data must never leave your own infrastructure at all — the proxy is open source and self-hostable, so you can run the whole pipeline, models included, on your own machines. That's an option for teams that need it, not a precondition for the privacy guarantees above. The in-process design already keeps detection out of third-party hands whether you run it yourself or use Grepture's cloud.

Once detection produces a clean set of spans, masking takes over — and if you want the model's answer to come back with the real values restored, that's the mask-and-restore flow. For the broader detection picture and how to choose categories, see PII detection best practices and our rundown of the best open-source PII models.
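For a sense of what mask-and-restore involves, here is a minimal sketch under our own assumptions: numbered placeholders such as `[EMAIL_1]`, an in-memory mapping kept for the duration of one request, and an email pattern standing in for the full detector. It shows the general idea only, not how Grepture implements the flow.

```python
import re

# Stand-in detector for the example: one simple email pattern.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(text: str):
    """Replace each email with a numbered placeholder and return the mapping."""
    mapping = {}

    def substitute(match):
        token = f"[EMAIL_{len(mapping) + 1}]"
        mapping[token] = match.group()
        return token

    return EMAIL_RE.sub(substitute, text), mapping


def restore(text: str, mapping) -> str:
    """Put the original values back wherever a placeholder appears in the response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

The provider only ever sees the placeholders; the mapping stays on the proxy side and is applied to the response on the way back.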

Key takeaways

Detection is the hard part of redaction, and Grepture splits it into two layers: deterministic validators for structured PII, transformer NER models for the unstructured kind.
Structured PII (emails, cards, IBANs, SSNs) is matched deterministically, with no ML guessing at something a checksum can prove. Predictable overlap resolution prevents half-redacted values, and confining replacement to JSON string values keeps the payload valid.
Names, organizations, and locations are detected by NER models, with subword reassembly and precise offset resolution; multilingual coverage is essential for an EU product.
The "multiple models" are task-specialized, not redundant — separate models for PII, prompt injection, toxicity, and classification, each loaded once and reused.
All detection runs in-process at the proxy, so raw text is never sent to a third-party detection API — true in Grepture's cloud, with self-hosting available when data must stay entirely on your own infrastructure.