
Embeddings

PII-redacting embeddings endpoint. OpenAI-compatible passthrough that strips sensitive data from the input before the vector is generated, so PII never lands in your vector store.

Overview

POST /v1/embeddings is an OpenAI-compatible passthrough that detects and redacts PII in the input before forwarding to OpenAI. The vector that comes back — and that you store in Pinecone, pgvector, Weaviate, or any other vector store — is derived from redacted text, so a PII leak into your vector store becomes structurally impossible.

Unlike chat completions, embeddings are persistent surface area. Once a vector with raw PII lives in your vector store, it cannot be selectively scrubbed: it gets queried by k-NN, and it gets re-injected into prompts via RAG. This endpoint solves that on the ingest side.

Endpoint

POST https://proxy.grepture.com/v1/embeddings

Same request and response shape as OpenAI's /v1/embeddings. You can drop it in anywhere an OpenAI embeddings client is used by pointing the base URL at Grepture.

Authentication

Two keys are involved:

  • Grepture API key — pass in Authorization: Bearer grp_.... Identifies your team for logging and rate limiting.
  • OpenAI API key — used for the upstream call. Resolved in this order:
    1. Caller-supplied (BYOK): x-grepture-auth-forward: Bearer sk-.... The caller pays OpenAI directly.
    2. Stored provider key: if no header is sent, Grepture uses your team's OpenAI key from Integrations → Provider Keys.
    3. Neither: returns 400 no_openai_key.
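The resolution order above can be sketched as a small function. This is illustrative only, not Grepture's actual internals; the names are made up for the example:

```typescript
// Sketch of the OpenAI-key resolution order described above.
type KeyResolution =
  | { source: "byok" | "stored"; key: string }
  | { error: "no_openai_key"; status: 400 };

function resolveOpenAIKey(
  authForwardHeader: string | undefined, // x-grepture-auth-forward
  storedProviderKey: string | undefined, // Integrations -> Provider Keys
): KeyResolution {
  // 1. Caller-supplied (BYOK) wins; the caller pays OpenAI directly.
  if (authForwardHeader?.startsWith("Bearer ")) {
    return { source: "byok", key: authForwardHeader.slice("Bearer ".length) };
  }
  // 2. Fall back to the team's stored provider key.
  if (storedProviderKey) {
    return { source: "stored", key: storedProviderKey };
  }
  // 3. Neither: the request fails with 400 no_openai_key.
  return { error: "no_openai_key", status: 400 };
}
```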

Basic usage

curl -X POST https://proxy.grepture.com/v1/embeddings \
  -H "Authorization: Bearer grp_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "email me at john.doe@example.com about order #45192"
  }'

Response is the standard OpenAI shape, plus two headers:

x-grepture-redactions: 1
x-grepture-pii-categories: email

The embedding vector returned was computed from "email me at [EMAIL_REDACTED] about order #45192" — the placeholder, not the original email.

Request body

| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | yes | Any OpenAI embeddings model (text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002). |
| input | string or string[] | yes | Pre-tokenized arrays of integers are not supported; see Input shape. |
| dimensions | number | no | Forwarded to OpenAI. Lets you truncate text-embedding-3 vectors. |
| encoding_format | "float" \| "base64" | no | Forwarded to OpenAI. |
| user | string | no | Forwarded to OpenAI. |

Headers

| Header | Default | Values |
|---|---|---|
| Authorization | | Bearer grp_... Grepture API key. Required. |
| x-grepture-auth-forward | | Bearer sk-... OpenAI key (BYOK). Optional. |
| x-grepture-on-pii | redact | redact \| block; see Modes. |
| x-grepture-redaction-strategy | placeholder | placeholder \| hash \| mask; see Redaction strategies. |
| x-grepture-trace-id | | Trace ID for cross-request grouping in the dashboard. |

What gets detected

Two detection layers run on every input, gated by tier:

Regex (all tiers) — email, phone, SSN, credit card, IP address, street address, date of birth.

NER (Pro and above) — person names, locations, organizations. Layered on top of the regex pass; matches are merged and deduped by position.

Each detection has a category and a position span. All distinct categories caught across the request are returned in the x-grepture-pii-categories header.
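The merge-and-dedupe step can be sketched as follows. This is an illustration of the described behavior, not Grepture's implementation; in particular, the rule that regex matches win on overlap is an assumption:

```typescript
// Illustrative sketch: layer NER matches on top of the regex pass,
// deduping by character-position overlap.
interface Detection {
  category: string;
  start: number; // inclusive character offset
  end: number;   // exclusive character offset
}

function mergeDetections(regex: Detection[], ner: Detection[]): Detection[] {
  // Assumption: regex matches are kept as-is; an NER match is dropped
  // if its span overlaps any span already kept.
  const merged = [...regex].sort((a, b) => a.start - b.start);
  for (const d of ner) {
    const overlaps = merged.some((m) => d.start < m.end && m.start < d.end);
    if (!overlaps) merged.push(d);
  }
  return merged.sort((a, b) => a.start - b.start);
}

// Distinct categories across the request, as surfaced in
// the x-grepture-pii-categories header.
function categories(detections: Detection[]): string[] {
  return [...new Set(detections.map((d) => d.category))].sort();
}
```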

Redaction strategies

The redaction strategy controls how matches are replaced before the request is forwarded.

placeholder (default, recommended) — replaces matches with stable strings like [EMAIL_REDACTED]. Every email becomes the same token, so two RAG documents about "email delivery problems" still embed to nearly identical vectors. This is the only strategy that preserves k-NN clustering. Use it unless you have a specific reason not to.

hash — replaces matches with a 12-character SHA-256 prefix. Every distinct value gets a distinct token, which breaks similarity-based retrieval — "email user1@x.com" and "email user2@x.com" will end up in different regions of vector space. Useful only if you also store the hash somewhere and need to correlate.

mask — replaces matches with first*last style masks (e.g., j****e@example.com). Partial signal survives but k-NN behavior is unpredictable. Rarely the right choice for embeddings.
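The three strategies can be sketched in a few lines. The placeholder naming, hash prefix length, and first*last mask shape follow the descriptions above, but the exact output format is an assumption, not Grepture's verbatim behavior:

```typescript
import { createHash } from "node:crypto";

// Illustrative sketch of the three redaction strategies described above.
function redact(
  value: string,
  category: string,
  strategy: "placeholder" | "hash" | "mask",
): string {
  switch (strategy) {
    case "placeholder":
      // Stable token per category: every email becomes the same string,
      // which preserves k-NN clustering.
      return `[${category.toUpperCase()}_REDACTED]`;
    case "hash":
      // Distinct value, distinct token: a 12-character SHA-256 prefix.
      return createHash("sha256").update(value).digest("hex").slice(0, 12);
    case "mask": {
      // first*last style mask, e.g. j****e@example.com (shape assumed).
      const [local, domain] = value.split("@");
      const masked =
        local[0] +
        "*".repeat(Math.max(local.length - 2, 1)) +
        local[local.length - 1];
      return domain ? `${masked}@${domain}` : masked;
    }
  }
}
```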

Modes

redact (default) — detected PII is replaced and the request is forwarded. Best for RAG workloads where you want maximum recall.

block — if any PII is detected, returns 422 pii_detected and does not forward. Use it for regulated workloads where no PII at all should leave your application:

curl -X POST https://proxy.grepture.com/v1/embeddings \
  -H "Authorization: Bearer grp_live_..." \
  -H "x-grepture-on-pii: block" \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-3-small","input":"call me at 555-123-4567"}'

Response:

{
  "error": "pii_detected",
  "categories": ["phone"],
  "count": 1
}

A row is still written to embedding_logs with blocked: true so the dashboard shows what was caught.
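In an ingestion pipeline, the 422 is the signal to skip or quarantine the document rather than store it. A minimal sketch, where the response-body shape comes from the example above but the skip/store decision logic is application-side and illustrative:

```typescript
// Shape of the 422 body shown above.
interface PiiBlockedBody {
  error: "pii_detected";
  categories: string[];
  count: number;
}

type IngestDecision =
  | { action: "store" }
  | { action: "skip"; reason: string };

// Illustrative: map an embeddings response to an ingestion decision.
function decide(status: number, body: unknown): IngestDecision {
  if (status === 422) {
    const blocked = body as PiiBlockedBody;
    return {
      action: "skip",
      reason: `PII detected (${blocked.categories.join(", ")}), ${blocked.count} match(es)`,
    };
  }
  if (status === 200) return { action: "store" };
  throw new Error(`embeddings call failed with status ${status}`);
}
```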

Input shape

input accepts either a single string or an array of strings:

{ "input": "one document to embed" }
{ "input": ["doc 1", "doc 2", "doc 3"] }

Pre-tokenized inputs (number[] or number[][] — what OpenAI calls "token arrays") are rejected with 400 tokenized_input_not_supported. The redaction pipeline operates on text; once a string has been tokenized client-side, there is no surface left to detect PII on.

If your client tokenizes by default (rare), switch it to pass strings.
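The accept/reject rule can be expressed as a small validator. This mirrors the 400 behavior described above but is an illustration, not the proxy's actual validation code:

```typescript
type ValidInput = string | string[];

// Strings and string arrays pass; token arrays are rejected, because once
// the text is tokenized there is nothing left to scan for PII.
function assertStringInput(input: unknown): ValidInput {
  if (typeof input === "string") return input;
  if (Array.isArray(input) && input.every((x) => typeof x === "string")) {
    return input as string[];
  }
  // number[] or number[][]: rejected with 400 tokenized_input_not_supported.
  throw Object.assign(new Error("tokenized_input_not_supported"), {
    status: 400,
  });
}
```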

SDK usage

The @grepture/sdk package exposes a typed wrapper:

import { Grepture } from "@grepture/sdk";

const grepture = new Grepture({
  apiKey: process.env.GREPTURE_API_KEY!,
  proxyUrl: "https://proxy.grepture.com",
});

const { data, redactions } = await grepture.embeddings.create({
  model: "text-embedding-3-small",
  input: "email me at john@example.com about order #12345",
});

console.log(redactions); // { count: 1, categories: ["email"] }

// `data` is the standard OpenAI embeddings array — pass it to Pinecone/pgvector/Weaviate as-is.

Options on embeddings.create():

| Option | Default | Notes |
|---|---|---|
| model | | Required. |
| input | | Required. String or string array. |
| dimensions | | Optional. Truncate text-embedding-3 vectors. |
| encoding_format | | "float" or "base64". |
| user | | Forwarded to OpenAI. |
| onPii | "redact" | "redact" or "block". |
| strategy | "placeholder" | "placeholder", "hash", or "mask". |
| openaiKey | | BYOK passthrough. Sent as x-grepture-auth-forward. |
| traceId | | Trace ID for dashboard grouping. |

block mode throws an error with status, code, categories, and count fields attached when PII is detected.
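That thrown error can be narrowed with a small type guard before reading its fields. A sketch assuming only the field names listed above (status, code, categories, count):

```typescript
// Shape of the error thrown by the SDK in block mode, per the description above.
interface PiiBlockedError extends Error {
  status: number;
  code: string;
  categories: string[];
  count: number;
}

// Illustrative type guard: narrow an unknown caught value to the block-mode error.
function isPiiBlocked(e: unknown): e is PiiBlockedError {
  return (
    e instanceof Error &&
    (e as PiiBlockedError).status === 422 &&
    (e as PiiBlockedError).code === "pii_detected"
  );
}
```

Typical use: wrap `grepture.embeddings.create()` in try/catch and, when `isPiiBlocked(e)` is true, log `e.categories` and skip the document instead of failing the batch.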

OpenAI SDK drop-in

If you are already using openai, change the base URL and add the Grepture key — no other code changes:

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.GREPTURE_API_KEY!,
  baseURL: "https://proxy.grepture.com/v1",
  defaultHeaders: {
    // Optional: pass your own OpenAI key per-request
    "x-grepture-auth-forward": `Bearer ${process.env.OPENAI_API_KEY}`,
  },
});

const { data } = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "email me at john@example.com",
});

The x-grepture-redactions and x-grepture-pii-categories headers are still set, but you'll need to read them off the raw response yourself; the typed redactions field is only available via @grepture/sdk.
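If you do need the raw headers (recent versions of the openai SDK can expose the underlying Response, e.g. via .withResponse()), parsing them into the same shape as the SDK's redactions field is a few lines. The header names come from this page; the parsing itself is an illustrative sketch:

```typescript
// Parse the two Grepture response headers into { count, categories }.
// `get` abstracts over Headers.get / plain header maps.
function parseRedactionHeaders(
  get: (name: string) => string | null,
): { count: number; categories: string[] } {
  const count = Number(get("x-grepture-redactions") ?? "0");
  const raw = get("x-grepture-pii-categories") ?? "";
  const categories = raw ? raw.split(",").map((s) => s.trim()) : [];
  return { count, categories };
}
```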

Errors

| Status | Code | Cause |
|---|---|---|
| 400 | tokenized_input_not_supported | input is a number array. Pass strings. |
| 400 | no_openai_key | No OpenAI key found via header or provider_keys. |
| 400 | | Missing or invalid model / input. |
| 401 | | Missing or invalid Grepture API key. |
| 422 | pii_detected | Block mode triggered; request not forwarded. |
| 429 | | Rate or quota limit hit. |
| 502 | | Upstream OpenAI unreachable. |

Observability

Every call writes a row to embedding_logs, viewable in the dashboard at Embeddings. The row records:

  • Model, input count, total characters, token usage, duration
  • Redaction count, categories caught, source (regex / ai / both)
  • Redaction strategy, blocked flag, status code
  • Trace ID, BYOK flag, provider key ID

Grepture does not store the input text or the response vectors. This is intentional: the point of the endpoint is to keep PII out of vector storage, and storing it on our side would defeat the feature. The dashboard shows counts and categories only.

Background reading

For the why behind this endpoint, see Your Vector Store Is a Permanent PII Leak.