PII Redaction for Embeddings — a First-Class Endpoint

POST /v1/embeddings is now a first-class endpoint, separate from the catch-all proxy, with its own dashboard view parallel to Traffic Log.

What it does — OpenAI-compatible passthrough that detects PII in the input field, replaces matches with stable placeholders (default), and forwards the redacted string to OpenAI. The vector that comes back, and that you store in Pinecone or pgvector, is derived from clean text. PII never reaches the vector store in the first place.
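As a rough illustration of the passthrough, the sketch below builds an OpenAI-style embeddings request aimed at the proxy and reads the vector out of an OpenAI-shaped response. The base URL, the bearer-token auth header, and the helper names are assumptions made for the example and are not taken from the Grepture docs. Only the /v1/embeddings path and the OpenAI-compatible request and response shapes come from the description above.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the proxy host for your own account.
BASE_URL = "https://proxy.example.com"


def build_embeddings_request(text, api_key,
                             model="text-embedding-3-small",
                             base_url=BASE_URL):
    """Build (but do not send) an OpenAI-compatible embeddings request."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth, as with OpenAI's own API.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def first_embedding(response_body):
    """Extract the first vector from an OpenAI-shaped embeddings response."""
    return json.loads(response_body)["data"][0]["embedding"]
```

Sending the request with urllib.request.urlopen(req) and passing the response bytes to first_embedding would yield the vector to store. In redact mode that vector would be computed from the placeholder text rather than the raw input.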

Why we built it separately — Embeddings have different physics from chat completions: they persist, they're queryable, they're load-bearing for RAG. Mixing them into traffic_logs would have flooded the chat-debugging view and forced storage-shape compromises (no point storing 50KB float arrays or the input text we just spent effort keeping out of the vector store).

Two modes — redact (default) replaces PII with placeholders so k-NN clustering still works; block (via x-grepture-on-pii: block) returns 422 if any PII is detected, for regulated workloads.
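A client might select between the two modes and handle the block-mode rejection roughly as follows. This is a minimal sketch: the header name, its block value, and the 422 status come from the description above, while the helper names and the PiiBlocked exception are hypothetical. The shape of the 422 response body is not specified here, so the sketch does not parse it.

```python
class PiiBlocked(Exception):
    """Raised when the proxy rejects a request because PII was detected."""


def mode_headers(mode="redact"):
    """Return the extra request headers for the chosen PII mode."""
    if mode == "redact":
        return {}  # redact is the default, so no header is needed
    if mode == "block":
        return {"x-grepture-on-pii": "block"}
    raise ValueError(f"unknown mode: {mode!r}")


def check_status(status):
    """Translate the HTTP status of an embeddings call into an outcome."""
    if status == 422:
        # In block mode, 422 signals that PII was found in the input.
        raise PiiBlocked("request rejected: PII detected in input")
    if status >= 400:
        raise RuntimeError(f"embeddings request failed with HTTP {status}")
```

Raising a dedicated exception lets an ingestion pipeline route blocked documents to a review queue instead of treating them as transient failures to retry.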

Free tier — Regex detection (email, phone, SSN, credit card, IP, address, DOB) ships free. NER detection for names, locations, and organizations is layered on top for Pro and above.
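To make the regex tier concrete, here is a small illustrative sketch of pattern-based redaction covering three of the listed types. The patterns, the [EMAIL]-style placeholder format, and the redact function are assumptions for illustration only and are not Grepture's actual rules. The sketch also assumes that a "stable" placeholder means every match of a given type maps to the same token.

```python
import re

# Illustrative patterns only; production detectors need to be far stricter.
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


def redact(text):
    """Replace each match with a type-level placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


def contains_pii(text):
    """Return True if any pattern matches (the check block mode would need)."""
    return any(pattern.search(text) for _, pattern in PATTERNS)
```

With these patterns, redact("Email a@x.io") and redact("Email b@y.org") both return "Email [EMAIL]". Two documents that differ only in the PII they contain therefore produce identical text, and so identical embeddings.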

Where it lives — Dashboard at Embeddings (parallel to Traffic Log). Docs at /docs/embeddings. Use case writeup at /use-cases#pii-redaction-for-embeddings.

For background on why vector stores are a permanent PII leak, and why embedding-time redaction with stable placeholders is the right shape, see "Your Vector Store Is a Permanent PII Leak".