How to Redact PII from Vercel AI SDK Calls

Stop sending names, emails, and secrets through the Vercel AI SDK. Learn how to redact PII from every LLM call using a proxy-level security layer, with only a few lines of setup.

The problem: PII leaking through Vercel AI SDK calls

The Vercel AI SDK (ai) makes it easy to build AI-powered features with generateText, streamText, and generateObject. But every call sends your prompt to an LLM provider. If that prompt includes user data — form submissions, chat messages, database records — it carries PII to the provider's servers.

import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = await generateText({
  model: openai("gpt-4o"),
  prompt: `Draft a response to this customer complaint:

    From: Amanda Foster
    Email: a.foster@megacorp.com
    Phone: (212) 555-0176
    Account: 4539-1488-0343-6467
    SSN: 613-44-2289
    Address: 350 5th Ave, New York, NY 10118
    Issue: My API key ghp_xK4mSecret123 was exposed in your logs.`,
});

That single call sent a name, email, phone number, credit card number, SSN, address, and a GitHub token to OpenAI. The Vercel AI SDK is provider-agnostic — the same risk applies whether you're using OpenAI, Anthropic, Google, or any other provider.

Why the AI SDK's flexibility increases risk

The Vercel AI SDK is designed for rapid iteration. That speed comes with PII risk:

  • generateText and streamText send freeform prompts directly to providers — any user data in the prompt goes with it
  • generateObject enforces output schemas but doesn't control what's in the input
  • Tool calling can pull data from APIs, databases, and user sessions — assembling PII from multiple sources
  • Multi-step agents (maxSteps) accumulate context across steps, increasing PII exposure with each iteration
  • Streaming to the client via useChat or useCompletion means PII in responses reaches the browser

The SDK makes it trivially easy to build features that process user data. That's its strength — but it means every call is a potential PII vector.
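Ad-hoc sanitization doesn't close this gap. As an illustration (this helper is hypothetical, not part of the SDK), a hand-rolled redactor typically covers one or two patterns and silently passes everything else through to the provider:

```typescript
// Illustrative only: a hand-rolled redactor that knows a single pattern.
// Every category it doesn't cover still reaches the provider.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function naiveRedact(prompt: string): string {
  return prompt.replace(EMAIL, "[EMAIL]");
}

const prompt = `From: Amanda Foster
Email: a.foster@megacorp.com
Phone: (212) 555-0176
API key: ghp_xK4mSecret123`;

const sanitized = naiveRedact(prompt);
console.log(sanitized.includes("a.foster@megacorp.com")); // false: email caught
console.log(sanitized.includes("(212) 555-0176"));        // true: phone leaks
console.log(sanitized.includes("ghp_xK4mSecret123"));     // true: token leaks
```

Keeping a pattern list like this complete, across every call site, is exactly the maintenance burden a proxy-level layer removes.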

The solution: proxy-level redaction with Grepture

Grepture is an open-source security proxy that sits between your AI SDK calls and any LLM provider. Every request is scanned for PII, secrets, and sensitive patterns before it leaves your infrastructure. Sensitive data is masked with reversible tokens — and restored in the response so your application works normally.

One proxy protects every provider in your stack. Your code barely changes.

Setup in 3 minutes

1. Install the SDK

npm install @grepture/sdk

2. Get your API key

Sign up at grepture.com/en/pricing — the free plan includes 1,000 requests/month. Copy your API key from the dashboard.
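The snippets below read both keys from environment variables. In a Next.js project they would typically live in .env.local (the filename is a convention; adjust for your setup):

```bash
# .env.local — never commit real keys
GREPTURE_API_KEY=your-grepture-api-key
OPENAI_API_KEY=your-openai-api-key
```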

3. Wrap your AI SDK provider

The Vercel AI SDK providers accept custom baseURL, headers, and fetch options. Use clientOptions() to route traffic through Grepture:

import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { Grepture } from "@grepture/sdk";

const grepture = new Grepture({
  apiKey: process.env.GREPTURE_API_KEY!,
  proxyUrl: "https://proxy.grepture.com",
});

const opts = grepture.clientOptions({
  apiKey: process.env.OPENAI_API_KEY!,
  baseURL: "https://api.openai.com/v1",
});

const openai = createOpenAI({
  baseURL: opts.baseURL,
  fetch: opts.fetch,
});

// Every generateText, streamText, and generateObject call is protected
const result = await generateText({
  model: openai("gpt-4o"),
  prompt: userInput,
});
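This pattern works because the provider hands every request to the fetch implementation you supply. Conceptually, a proxy-routing fetch just re-points the request at the proxy and attaches the proxy's credentials. The sketch below is illustrative only; the real clientOptions() is more involved, and the makeProxyFetch name and header names here are hypothetical:

```typescript
// Conceptual sketch of a proxy-routing fetch wrapper (not Grepture's
// actual implementation; makeProxyFetch and the header names are
// hypothetical).
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

function makeProxyFetch(
  proxyUrl: string,
  proxyKey: string,
  upstreamBase: string,
  baseFetch: FetchLike
): FetchLike {
  return (url, init = {}) => {
    // Re-point the request at the proxy, preserving the API path...
    const path = url.startsWith(upstreamBase) ? url.slice(upstreamBase.length) : url;
    const headers = new Headers(init.headers);
    // ...and tell the proxy who is calling and where to forward the request.
    headers.set("x-proxy-key", proxyKey);
    headers.set("x-upstream-base", upstreamBase);
    return baseFetch(`${proxyUrl}${path}`, { ...init, headers });
  };
}
```

The proxy then scans the request body, forwards the sanitized payload upstream, and relays the response, which is why the AI SDK needs no other changes.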

Works with any AI SDK provider

The same pattern works with Anthropic, Google, and any OpenAI-compatible provider:

import { createAnthropic } from "@ai-sdk/anthropic";

const opts = grepture.clientOptions({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  baseURL: "https://api.anthropic.com",
});

const anthropic = createAnthropic({
  baseURL: opts.baseURL,
  fetch: opts.fetch,
});

const result = await generateText({
  model: anthropic("claude-sonnet-4-5-20250929"),
  prompt: userInput,
});

What gets detected

Grepture ships with 50+ detection patterns on the free tier and 80+ on Pro, covering:

| Category | Examples | Tier |
| --- | --- | --- |
| Personal identifiers | Names, emails, phone numbers, SSNs, dates of birth | Free (regex), Pro (AI) |
| Financial data | Credit card numbers, IBANs, routing numbers | Free |
| Credentials | API keys, bearer tokens, passwords, connection strings | Free |
| Network identifiers | IP addresses, MAC addresses | Free |
| Freeform PII | Names, organizations, and addresses in unstructured text | Pro (local AI models) |
| Adversarial inputs | Prompt injection attempts | Business |

All detection runs on Grepture infrastructure — no data is forwarded to additional third parties.

Mask and restore: reversible redaction

Grepture doesn't just strip PII — it replaces sensitive values with tokens, sends the sanitized prompt to the provider, and restores the original values in the response.

What the LLM sees:

Draft a response to this customer complaint:
From: [PERSON_1]
Email: [EMAIL_1]
Phone: [PHONE_1]
Account: [CREDIT_CARD_1]
Issue: My API key [SECRET_1] was exposed in your logs.

What your app gets back:

Dear Amanda Foster, thank you for reporting that your
GitHub token (ghp_xK4mSecret123) was exposed. We've
rotated the credentials and can confirm no unauthorized
access occurred on your account ending in 6467.

The model processes clean data. Your application receives the full, personalized response. No PII ever reaches the LLM provider.
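The mask-and-restore round trip can be sketched in a few lines of TypeScript. This is a simplified illustration with two regex patterns; Grepture's detection is far broader, and the mask/restore function names are hypothetical:

```typescript
// Minimal mask/restore sketch: replace matches with numbered tokens,
// remember the mapping, and reverse it on the response.
const PATTERNS: Record<string, RegExp> = {
  EMAIL: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  PHONE: /\(\d{3}\)\s?\d{3}-\d{4}/g,
};

function mask(text: string): { masked: string; map: Map<string, string> } {
  const map = new Map<string, string>();
  let masked = text;
  for (const [label, re] of Object.entries(PATTERNS)) {
    let n = 0;
    masked = masked.replace(re, (value) => {
      const token = `[${label}_${++n}]`;
      map.set(token, value); // keep the mapping to undo the redaction later
      return token;
    });
  }
  return { masked, map };
}

function restore(text: string, map: Map<string, string>): string {
  let out = text;
  for (const [token, value] of map) out = out.split(token).join(value);
  return out;
}

const { masked, map } = mask("Reach Amanda at a.foster@megacorp.com or (212) 555-0176.");
// masked: "Reach Amanda at [EMAIL_1] or [PHONE_1]."
// restore(llmResponse, map) puts the original values back.
```

The essential property is that the mapping never leaves your trust boundary: only tokens go to the provider, and only your side can reverse them.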

Streaming support

Grepture handles the AI SDK's streaming natively. streamText and useChat work without modification — the proxy detokenizes chunks in real time.

import { streamText } from "ai";

const result = streamText({
  model: openai("gpt-4o"),
  prompt: userInput,
});

for await (const chunk of result.textStream) {
  // Tokens are restored in real time
  process.stdout.write(chunk);
}

This also works seamlessly with Next.js API routes and the useChat hook on the client — the streamed response is detokenized before it reaches the browser.
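Streaming restore has one wrinkle worth understanding: a token like [EMAIL_1] can be split across chunk boundaries. A simplified approach (illustrative only; the proxy handles this server-side, and makeStreamRestorer is a hypothetical name) buffers any trailing partial token before emitting:

```typescript
// Restore tokens in a stream, holding back a possibly-split token at the
// end of each chunk so "[EMA" + "IL_1]" is still rewritten correctly.
function makeStreamRestorer(map: Map<string, string>) {
  let buffer = "";
  return function push(chunk: string): string {
    buffer += chunk;
    // Replace every complete token currently in the buffer.
    for (const [token, value] of map) buffer = buffer.split(token).join(value);
    // Hold back a trailing "[..." that might be the start of a split token.
    const cut = buffer.lastIndexOf("[");
    const safeEnd = cut === -1 || buffer.includes("]", cut) ? buffer.length : cut;
    const out = buffer.slice(0, safeEnd);
    buffer = buffer.slice(safeEnd);
    return out;
  };
}
```

A production implementation would also flush the buffer when the stream ends; the sketch only shows the per-chunk logic.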

Next steps