OpenAI shipped a redaction model — and open-sourced it
On April 22, 2026, OpenAI released Privacy Filter, a small open-weight model purpose-built for detecting personally identifiable information in unstructured text. It's licensed Apache 2.0 and published on Hugging Face, which means you can pull the weights, fine-tune it, and run it on your own hardware — including on a developer's laptop.
That last part is the interesting one. OpenAI is the company most associated with "send your data to our API," and the headline framing in their announcement is the opposite: they want you to scrub PII before it ever reaches a model. The intake risk — users pasting logs, emails, or contracts into ChatGPT — is the thing they're trying to defuse.
So what is the model actually good at, and where does it sit relative to the open-source PII landscape that already existed? Let's dig in.
What's in the box
Privacy Filter is a sparse mixture-of-experts (MoE) model with 1.5B total parameters but only 50M active parameters per token, thanks to a 128-expert feed-forward design with top-4 routing. The aggressive sparsity is deliberate: at inference time, you're effectively running a 50M-parameter model, which is small enough to run in a browser tab or on a CPU.
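If sparse MoE is unfamiliar, the routing trick looks roughly like the sketch below: a router scores all experts per token, but only the top 4 ever run. This is a toy illustration with made-up dimensions, not OpenAI's actual implementation.

```python
import torch
import torch.nn as nn

class SparseMoEFFN(nn.Module):
    """Toy sparse-MoE feed-forward layer: many experts, few active per token.
    Dimensions and gating details are illustrative, not OpenAI's design."""

    def __init__(self, d_model: int = 256, d_ff: int = 512,
                 n_experts: int = 128, top_k: int = 4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # one score per expert, per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model). Keep only the 4 best-scoring experts per
        # token; the other 124 experts' weights are never touched for it.
        weights, chosen = self.router(x).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for t in range(x.size(0)):            # naive per-token dispatch
            for k in range(self.top_k):
                expert = self.experts[int(chosen[t, k])]
                out[t] += weights[t, k] * expert(x[t])
        return out

layer = SparseMoEFFN()
print(layer(torch.randn(10, 256)).shape)  # torch.Size([10, 256])
```

Total parameters scale with the number of experts; per-token compute scales only with the number of *chosen* experts. That's the whole trade.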
It detects eight categories of sensitive spans:
| Label | What it catches |
|---|---|
| private_person | Names of real people |
| private_email | Email addresses |
| private_phone | Phone numbers |
| private_address | Physical addresses |
| private_url | URLs that identify someone |
| private_date | Dates of birth, sensitive dates |
| account_number | Bank, credit card, account IDs |
| secret | API keys, passwords, tokens |
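Assuming the release loads like any other Hugging Face token classifier, inference is a few lines. The repo id below is a placeholder, not confirmed from the model card:

```python
from transformers import pipeline

# Placeholder repo id -- take the real path (and whether trust_remote_code
# is needed for the MoE architecture) from the actual model card.
detector = pipeline(
    "token-classification",
    model="openai/privacy-filter",
    aggregation_strategy="simple",  # merge sub-word tokens into whole spans
)

text = "Ping Jane Roe at jane.roe@example.com; staging key is sk-test-4242."
for span in detector(text):
    # entity_group should be one of the eight labels above,
    # e.g. private_person, private_email, secret
    print(span["entity_group"], repr(span["word"]), round(float(span["score"]), 3))
```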
That last category — secret — is unusual. Most PII models stop at "personal information." OpenAI explicitly built credential detection into the same model, which mirrors a pattern we've seen across production AI gateways: developers paste code into LLMs and credentials slip through.
The benchmark numbers
On the public PII-Masking-300k benchmark, Privacy Filter posts:
- F1: 96.0% (94.04% precision / 98.04% recall)
- On a corrected version of the same benchmark, F1: 97.43%
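Those two headline numbers are internally consistent, since F1 is just the harmonic mean of precision and recall:

```python
precision, recall = 0.9404, 0.9804
f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.4f}")  # 0.9600
```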
For reference, a fine-tuned DeBERTa v3 on the same dataset hits F1 ~97.6% — meaning Privacy Filter is roughly tied with the best dense model the open-source community has produced, while being substantially smaller at inference time. That's a real result, not a press release.
Why "run it locally" matters
The interesting design decision is the throughput target. OpenAI explicitly framed Privacy Filter as a "high-throughput privacy workflow" tool — meant to scrub data before it hits another model, not to be the model itself.
There are three places this kind of thing actually gets used:
1. At ingestion time — before logs, support tickets, or documents are stored or indexed.
2. In a proxy in front of LLM calls — strip PII from the prompt on its way to OpenAI / Anthropic / a self-hosted model.
3. At the edge / in the browser — sanitize before the data even leaves the user's device.
A 50M-active-param model with 96% F1 makes (3) realistic for the first time. You don't need a GPU pool to redact a paragraph of text. You can run this in a service worker.
That's a meaningful shift. Previously, in-browser PII detection meant much smaller, weaker models — or shipping a full DeBERTa to the client, which is too heavy for most apps.
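Wherever you put it, the scrub step itself stays small. Here's a minimal one-way version for (1) and (2); the placeholder format and the detector's output shape are illustrative assumptions, not anything Privacy Filter prescribes:

```python
def scrub(text: str, detector) -> str:
    """One-way redaction: swap every detected span for a typed placeholder.
    `detector` is any callable returning spans as dicts with `start`, `end`,
    and `label` keys (the HF pipeline above calls the label `entity_group`)."""
    # Replace right-to-left so earlier character offsets stay valid.
    for span in sorted(detector(text), key=lambda s: s["start"], reverse=True):
        text = text[: span["start"]] + f'[{span["label"].upper()}]' + text[span["end"] :]
    return text

# In a proxy, this runs on the prompt before it is forwarded upstream:
# clean_prompt = scrub(user_prompt, detector)
```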
How it compares to what already existed
For a deeper tour of the existing landscape, see our comparison of open-source PII redaction models. The short version, with Privacy Filter dropped in:
| Model | Approach | Entity Types | F1 | Footprint |
|---|---|---|---|---|
| OpenAI Privacy Filter | MoE token classifier | 8 | 0.96–0.97 | 50M active params |
| DeBERTa v3 + ai4privacy | Token classification | 54 | 0.9757 | ~180M params |
| GLiNER-PII (Knowledgator) | Zero-shot NER | 60+ | ~0.81 | ~160M params |
| Piiranha (mDeBERTa) | Token classification | 17 | ~0.98 (token) | ~280M params |
| Presidio + spaCy | Framework + NER | Configurable | Varies | Varies |
A few honest observations:
- Entity coverage is narrower. Eight categories vs 54+ for ai4privacy DeBERTa or 60+ for GLiNER. If you need fine-grained labels (e.g. distinguishing `IBAN` from `credit_card` from `routing_number`), Privacy Filter collapses those into `account_number` and you'll have to layer regex or another model on top.
- It's not zero-shot. Unlike GLiNER, you can't ask it for a custom entity type at inference. The label set is baked in.
- It's English-first. OpenAI's model card focuses on English performance. Multilingual use is possible but uncharacterized — Piiranha is still the safer bet for cross-language coverage.
- The accuracy/footprint tradeoff is the headline. For the size class, F1 of 96-97% is excellent. The MoE architecture is doing real work here.
Where it fits in a real pipeline
The honest framing is that Privacy Filter is a really good detection model, not a complete redaction system. A production pipeline still needs:
- Regex for structured PII — credit card patterns, SSNs, common credential formats. These are sub-millisecond, deterministic, and auditable. Don't replace them; run them first.
- The ML model for unstructured PII — names, addresses, contextual entities. This is where Privacy Filter, DeBERTa, or GLiNER actually earn their keep.
- Reversible mapping — most teams don't actually want to destroy PII. They want to mask it before it leaves their environment, get a useful response from an LLM, then restore the original values when the response comes back. Privacy Filter detects the spans; you still need to maintain a per-request mapping to put them back (a minimal version is sketched after this list).
- Observability — what got redacted, on which request, by which rule. Without an audit trail, you can't answer the compliance questions that drove you to do this in the first place.
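Here's a minimal sketch of that reversible-mapping step, with a regex pre-pass for one structured pattern. The placeholder format, the card regex, and the in-memory dict are all simplifying assumptions; a production gateway needs per-request storage and an audit log on top:

```python
import re

# Crude card-number pattern, purely illustrative -- production rules are
# stricter (Luhn check, issuer prefixes) and there are many more of them.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def mask(text: str, detector) -> tuple[str, dict[str, str]]:
    """Regex first, then the ML detector; each hit becomes a numbered
    placeholder and the original value is kept so it can be restored.
    `detector` returns spans as dicts with `start`, `end`, and `label`
    (the HF pipeline calls the label `entity_group`)."""
    mapping: dict[str, str] = {}

    def stash(value: str, label: str) -> str:
        token = f"<{label}_{len(mapping)}>"
        mapping[token] = value
        return token

    text = CARD_RE.sub(lambda m: stash(m.group(0), "ACCOUNT_NUMBER"), text)

    # Replace right-to-left so earlier character offsets stay valid.
    for s in sorted(detector(text), key=lambda s: s["start"], reverse=True):
        token = stash(text[s["start"] : s["end"]], s["label"].upper())
        text = text[: s["start"]] + token + text[s["end"] :]
    return text, mapping

def restore(response: str, mapping: dict[str, str]) -> str:
    """Put the original values back into the LLM's response."""
    for token, original in mapping.items():
        response = response.replace(token, original)
    return response
```

The `mask`/`restore` pair is the whole point: the LLM sees `<PRIVATE_PERSON_0>`, your user sees their real name in the answer.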
If you're building this from scratch, Privacy Filter is a strong default for step 2. If you already have GLiNER or a DeBERTa fine-tune in production, the question is mostly about footprint: can you drop a 180M-param model and deploy a 50M-active-param one in the same place?
What this does and doesn't change
This release validates the architecture more than the market. The shape — small open-weight model, run it locally, scrub before sending to a frontier model — is exactly the architecture Grepture and several other projects have been pushing for the last couple of years. OpenAI shipping it under their own brand is a useful tailwind: it makes the case to security teams who used to ask "why are we redacting at all?"
What it doesn't do is make the rest of the problem disappear. Detecting PII is the easy half. The hard half is everything around it: the regex layer for structured patterns, the policy layer that decides what to do with each detection, the reversible mapping so AI responses stay useful, the audit trail your compliance team will eventually ask for, and the secret-scanning rules that catch credentials Privacy Filter's secret label might miss.
A model is a component. A pipeline is a product.
How Grepture helps
Grepture is the pipeline. We sit between your application and the LLM provider, run regex + ML detection on every request, support reversible masking so responses stay personalized, and log every redaction decision for audit. You get the architecture Privacy Filter is built for, without having to host the model, stitch the layers together, or build the policy engine yourself.
If you're evaluating Privacy Filter and thinking through where to deploy it, that's the right instinct — and the gateway is usually the answer. We can also run Privacy Filter as the ML detection backend if that's the model you want to standardize on.