Meet NanoGPT
NanoGPT is a multi-model AI platform: one interface in front of dozens of LLMs across OpenAI, Anthropic, Google, Mistral, and the long tail of open-source models. Users choose the model they want, type a prompt, and NanoGPT routes the request to whichever upstream they picked, with usage settled in crypto.
The problem
Every NanoGPT user request flows out to a third-party model the user picked — that is the product, and it is also the risk surface. Customer records pasted into a support summariser. AWS keys pasted into a code-review tool. A stack trace with internal hostnames pasted into a debug assistant. Each of those payloads leaves NanoGPT's perimeter the moment the user hits send, and lands somewhere NanoGPT cannot audit on the user's behalf.
This shape isn't unique to NanoGPT. Any product that puts users in front of a large language model — a chat UI, an aggregator, an AI coding tool, an internal copilot — has the same problem the model providers themselves don't have: users paste things they shouldn't, to providers you don't control.
NanoGPT's CEO and co-founder, Milan de Reede, summarised the underlying behaviour in their own write-up:
"People paste a lot into AI tools: support tickets, logs, stack traces, emails, internal notes, customer records, code, config files."
And telling users to be careful does not scale:
"Telling everyone to 'just be careful' does not scale. Even careful users miss things."
Why a proxy, not a library
The obvious first instinct is to add a client-side regex. That works for ten patterns, then breaks the moment a user pastes something the regex doesn't know about, or the moment you onboard a second model provider, or the moment marketing asks for a "redaction report."
NanoGPT chose Grepture because we sit at the layer where every prompt has to pass through anyway:
"Built for this exact point in the AI stack: the request path between your app and the model."
The request path runs:
User → NanoGPT → Grepture → Model → Grepture → NanoGPT → User
Two important properties fall out of this:
- Provider-agnostic. Every model NanoGPT supports — OpenAI, Anthropic, Google, plus the long tail — gets the same redaction policy without per-provider code. The aggregator pattern compounds the value: one integration, dozens of upstreams covered.
- Reversible redaction. A naive scrubber replaces "Sarah Chen" with
[NAME]and the model loses the thread of the conversation. Grepture tokenises the value, lets the model reason about the placeholder, and restores the original in the response stream. The user sees their own data back; the model never did.
NanoGPT runs on Grepture's EU infrastructure with zero-data mode — token mappings live in memory for the duration of the request (TTL 1 hour for response restore) and are never written to disk.
Who else has this exact problem
NanoGPT was early to take this seriously, but the underlying pattern repeats across the AI ecosystem:
- BYOK aggregators like NanoGPT itself — users route to many models, none of which they control.
- Enterprise ChatGPT wrappers — companies that put a Grepture-style gateway between employees and the public LLMs they're already using.
- Multi-provider AI products — apps that mix Claude for reasoning, GPT-4 for code, and Gemini for vision, all under one UI.
- Agent orchestration platforms — workflows that fan a single user input out to several model calls, multiplying the leak surface.
If your product has end users typing into a box that eventually reaches a model you don't operate, you face the same risk NanoGPT solved. The fix is the same: put a redaction-aware proxy in front, treat it as infrastructure, not a feature.
Try it
The fastest way to see this in action is the same path NanoGPT took: point your existing model calls at the Grepture proxy, configure the rules you want, and watch the traffic log. We have a free tier for evaluation and a Business plan that mirrors NanoGPT's setup.
Or read NanoGPT's own write-up — they go deeper on the failure modes they were watching for, and what convinced them to offer redaction as a first-class opt-in for users who handle sensitive prompts.