Ben @ Grepture

From PII Redaction to AI Gateway — Why We're Expanding Grepture

We started by catching sensitive data before it reaches an LLM. Now we're building a unified AI gateway with prompt management, tracing, a browser extension, and a CLI — here's why.

It started with a proxy

Grepture began with a straightforward idea: sit between your app and the LLM, catch sensitive data before it leaves your network. PII redaction at request time. No SDK callbacks, no post-hoc scanning — just a reverse proxy that inspects every prompt and response as it flows through.

That core still exists, and it's still important. But once you're in the hot path of every AI call, you start to see something bigger. Every prompt, every response, every token is already flowing through you. The question stopped being "what else should we block?" and became "what else can we do here?"

The answer, it turns out, is a lot.

Why being in the hot path matters

There's a growing ecosystem of AI observability tools — Langfuse, Helicone, LangSmith, and others. They're good at what they do: giving you visibility into your AI pipeline after requests have been made. If you need tracing, evaluation, and prompt analytics, they have you covered.

Grepture takes a different approach. We're not watching from the sidelines — we're in the request path itself. That means we can do everything an observability tool does (log, trace, measure), but we can also act. Redact a credit card number before it hits OpenAI. Block a request that triggers a prompt injection rule. Serve a managed prompt instead of the one hardcoded in your repo. Route traffic to a different model based on content.

This isn't better or worse than pure observability — it's a different architectural choice. But we think it's the right foundation for a unified AI gateway, because observation alone doesn't give you control.
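To make the in-path idea concrete, here's a minimal sketch of a rule pipeline that redacts, blocks, and routes a request before it reaches a provider. Everything here is illustrative — the function names, the card regex, the injection rule, and the routing policy are assumptions for the example, not Grepture's actual API.

```python
import re

# Naive card-number pattern -- illustrative only, not a production detector.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def redact(request):
    # Mask anything that looks like a card number before it leaves the network.
    request["prompt"] = CARD_RE.sub("[REDACTED]", request["prompt"])
    return request

def block_injection(request):
    # Reject prompts matching a (deliberately simple) injection rule.
    if "ignore previous instructions" in request["prompt"].lower():
        raise PermissionError("blocked: prompt injection rule matched")
    return request

def route(request):
    # Send short prompts to a cheaper model -- an illustrative routing policy.
    request["model"] = "small-model" if len(request["prompt"]) < 200 else "big-model"
    return request

def gateway(request, rules=(redact, block_injection, route)):
    # Each rule can rewrite, reject, or re-route -- that's the "act" part
    # observability-only tools can't do.
    for rule in rules:
        request = rule(request)
    return request
```

A request like `{"prompt": "Card: 4111 1111 1111 1111", "model": "default"}` comes out redacted and routed; an injection attempt never leaves the building.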

What's new

We've been shipping. Here's what Grepture looks like today beyond PII redaction.

Prompt management

Prompts shouldn't live as strings scattered across your codebase. With Grepture, you can store, version, and serve prompts through the dashboard. Update a system prompt without redeploying. Roll back when something breaks. See which version of a prompt generated which responses.

This turns prompts into a managed resource — something your team can iterate on without a deploy cycle and without giving everyone access to the repo.
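The store-version-serve-rollback loop described above can be sketched in a few lines. This is a hypothetical in-memory stand-in for what the dashboard-backed store does — the class and method names are illustrative, not Grepture's SDK.

```python
class PromptStore:
    """Illustrative versioned prompt store: publish, serve, roll back."""

    def __init__(self):
        self._versions = {}  # name -> list of prompt texts; index is the version
        self._active = {}    # name -> index of the currently served version

    def publish(self, name, text):
        # Append a new version and make it live -- no redeploy needed.
        versions = self._versions.setdefault(name, [])
        versions.append(text)
        self._active[name] = len(versions) - 1
        return self._active[name]

    def get(self, name):
        # Serve whichever version is currently live.
        return self._versions[name][self._active[name]]

    def rollback(self, name):
        # Step back one version when something breaks.
        if self._active[name] > 0:
            self._active[name] -= 1
        return self._active[name]
```

Because the app fetches the prompt at request time instead of hardcoding it, publishing or rolling back takes effect immediately — which is the whole point of treating prompts as a managed resource.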

Tracing

Every request that flows through Grepture is logged with full context: the prompt sent, the response received, token counts, latency, cost, and which model handled it. Link multi-turn conversations with trace IDs. Filter by model, by cost, by time range.

This is the kind of visibility you'd normally set up a separate observability tool for. Because we're already in the path, you get it for free — no extra SDK, no additional integration.
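As a rough sketch, a trace record with the fields listed above might look like this — the dataclass and the filter helper are illustrative shapes, not Grepture's actual schema.

```python
from dataclasses import dataclass

@dataclass
class TraceRecord:
    trace_id: str          # links the turns of one conversation
    model: str             # which model handled the request
    prompt: str
    response: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    cost_usd: float

def filter_traces(records, model=None, max_cost=None):
    # The kind of slicing the dashboard offers: by model, by cost.
    return [r for r in records
            if (model is None or r.model == model)
            and (max_cost is None or r.cost_usd <= max_cost)]
```

Multi-turn conversations share a `trace_id`, so reconstructing a thread is just a filter on that field.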

Browser extension — fighting shadow AI

Here's a problem no proxy can solve on its own: employees pasting sensitive data into ChatGPT, Claude, and other AI chat interfaces directly in the browser. No API call, no proxy in the middle, no audit trail.

We built Grepture Browse — a Chrome extension that detects PII and secrets in real time as users type into AI chat inputs. It flags what it finds and lets users redact before sending. On the free tier, it uses local regex patterns. On Pro, it connects to your team's Grepture rules, so the same policies that protect your API traffic also protect browser-based AI usage.
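To give a feel for what "local regex patterns" means, here's a minimal detector of the kind the free tier could run entirely on-device. The extension itself is JavaScript; this Python sketch and its patterns are illustrative, not the actual rule set.

```python
import re

# Illustrative local-only patterns -- no network call, no data leaves the page.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan(text):
    # Return (label, match) pairs so the UI can flag each finding before send.
    findings = []
    for label, pattern in PATTERNS.items():
        findings += [(label, m.group()) for m in pattern.finditer(text)]
    return findings
```

The Pro tier swaps this fixed table for the team's centrally managed rules, so browser and API traffic are judged by the same policy.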

Shadow AI is one of the hardest problems in enterprise AI adoption. We think meeting it at the browser level — where the data actually enters these tools — is the right approach.

CLI (coming soon)

We're building a CLI that lets developers proxy and inspect AI traffic through Grepture from their local environment. Think of it as your local AI dev tool: see exactly what your LLM receives, test rules against real traffic, debug prompts — all from the terminal.

The goal is to make Grepture part of the developer workflow, not just the production infrastructure. We want the same visibility and control you get in production available on your laptop while you're building.

Why "unified AI gateway" makes sense

AI usage is fragmented. Your backend calls OpenAI. Your RAG pipeline hits Anthropic. Your support team uses ChatGPT in the browser. Your data team experiments with Gemini locally. Every touchpoint is a different integration, a different set of logs, a different blind spot.

We think there should be one layer that sees and controls all of it. One place to manage prompts, enforce policies, trace conversations, and track costs — regardless of which model or which interface is being used.

That's what we mean by AI gateway. Not a fancy load balancer, but a genuine control plane for how your organisation uses AI.

Open source at the core

Grepture's proxy is open source and self-hostable. We believe the thing sitting between your app and your LLM should be auditable and transparent — you shouldn't have to trust a black box with every prompt you send.

We're looking to expand what's open source over time. The core proxy, the detection engine, and the SDK are available today. As the product grows, we want more of it to be open.

What we're building toward

We're not trying to be everything to everyone. We're building a product developers actually want to use — not a compliance checkbox you install and forget. The security and data protection that Grepture started with are still there, and they're still a core part of the product. But now they're one capability inside something bigger.

If you're building with LLMs, we'd love for you to give Grepture a try. Drop in the SDK, point your API calls through the proxy, and see what your AI traffic actually looks like. It takes about five minutes.
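Pointing API calls through a proxy is usually just a base-URL swap — most LLM SDKs let you override the endpoint they talk to. Here's a tiny sketch of that idea; `gateway.example.internal` is a hypothetical placeholder for wherever your proxy is deployed, not a real Grepture address.

```python
from urllib.parse import urlsplit, urlunsplit

def via_proxy(api_url, proxy_host="gateway.example.internal"):
    # Keep the scheme, path, and query; swap only the host the SDK talks to,
    # so every request flows through the gateway instead of going direct.
    parts = urlsplit(api_url)
    return urlunsplit((parts.scheme, proxy_host, parts.path,
                       parts.query, parts.fragment))
```

In practice you'd set this once — via the SDK's base-URL option or an environment variable — and every call from then on is inspected, traced, and governed in flight.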