Embeddings
PII-redacting embeddings endpoint. OpenAI-compatible passthrough that strips sensitive data from the input before the vector is generated, so PII never lands in your vector store.
Overview
POST /v1/embeddings is an OpenAI-compatible passthrough that detects and redacts PII in the input before forwarding to OpenAI. The vector that comes back — and that you store in Pinecone, pgvector, Weaviate, or any other vector store — is derived from redacted text, so a PII leak into your vector store becomes structurally impossible.
Unlike chat completions, embeddings are persistent surface area. Once a vector with raw PII lives in your vector store, it cannot be selectively scrubbed, it gets queried by k-NN, and it gets re-injected into prompts via RAG. This endpoint solves that at the ingest side.
Endpoint
POST https://proxy.grepture.com/v1/embeddings
Same request and response shape as OpenAI's /v1/embeddings. You can drop it in anywhere an OpenAI embeddings client is used by pointing the base URL at Grepture.
Authentication
Two keys are involved:
- Grepture API key — pass in `Authorization: Bearer grp_...`. Identifies your team for logging and rate limiting.
- OpenAI API key — used for the upstream call. Resolved in this order:
  1. Caller-supplied (BYOK): `x-grepture-auth-forward: Bearer sk-...`. The caller pays OpenAI directly.
  2. Stored provider key: if no header is sent, Grepture uses your team's OpenAI key from Integrations → Provider Keys.
  3. Neither: returns `400 no_openai_key`.
Basic usage
curl -X POST https://proxy.grepture.com/v1/embeddings \
-H "Authorization: Bearer grp_live_..." \
-H "Content-Type: application/json" \
-d '{
"model": "text-embedding-3-small",
"input": "email me at john.doe@example.com about order #45192"
}'
Response is the standard OpenAI shape, plus two headers:
x-grepture-redactions: 1
x-grepture-pii-categories: email
The embedding vector returned was computed from "email me at [EMAIL_REDACTED] about order #45192" — the placeholder, not the original email.
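If you call the endpoint with raw `fetch` rather than an SDK, the redaction metadata arrives only in those two response headers. A minimal sketch of parsing them — the header names come from this page, but the helper itself, and the assumption that multiple categories arrive comma-separated, are illustrative:

```typescript
// Sketch: read Grepture's redaction headers off a raw fetch Response.
interface RedactionSummary {
  count: number;
  categories: string[];
}

function parseRedactionHeaders(headers: Headers): RedactionSummary {
  const count = Number(headers.get("x-grepture-redactions") ?? "0");
  const raw = headers.get("x-grepture-pii-categories") ?? "";
  // Assumption: multiple categories are comma-separated, e.g. "email,phone".
  const categories = raw ? raw.split(",").map((c) => c.trim()) : [];
  return { count, categories };
}

// Usage against a live call (requires a real key):
// const res = await fetch("https://proxy.grepture.com/v1/embeddings", { /* ... */ });
// const { count, categories } = parseRedactionHeaders(res.headers);
```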
Request body
| Field | Type | Required | Notes |
|---|---|---|---|
| `model` | string | yes | Any OpenAI embeddings model (`text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`). |
| `input` | string or string[] | yes | Pre-tokenized arrays of integers are not supported — see Input shape. |
| `dimensions` | number | no | Forwarded to OpenAI. Lets you truncate `text-embedding-3` vectors. |
| `encoding_format` | `"float"` or `"base64"` | no | Forwarded to OpenAI. |
| `user` | string | no | Forwarded to OpenAI. |
Headers
| Header | Default | Values |
|---|---|---|
| `Authorization` | — | `Bearer grp_...` Grepture API key. |
| `x-grepture-auth-forward` | — | `Bearer sk-...` OpenAI key (BYOK). Optional. |
| `x-grepture-on-pii` | `redact` | `redact` or `block` — see Modes. |
| `x-grepture-redaction-strategy` | `placeholder` | `placeholder`, `hash`, or `mask` — see Redaction strategies. |
| `x-grepture-trace-id` | — | Trace ID for cross-request grouping in the dashboard. |
What gets detected
Two detection layers run on every input, gated by tier:
- Regex (all tiers) — email, phone, SSN, credit card, IP address, street address, date of birth.
- NER (Pro and above) — person names, locations, organizations. Layered on top of the regex pass; matches are merged and deduped by position.
Each detection has a category and a position span. All distinct categories caught across the request are returned in the x-grepture-pii-categories header.
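To make the merge-and-dedupe step concrete, here is an illustrative sketch of combining two detection passes by position. The field names (`category`, `start`, `end`) and the overlap rule are assumptions for illustration, not Grepture's internals:

```typescript
// Illustrative: merge regex and NER hits, dropping overlapping spans.
interface Detection {
  category: string;
  start: number; // inclusive character offset
  end: number;   // exclusive character offset
}

function mergeDetections(regexHits: Detection[], nerHits: Detection[]): Detection[] {
  const all = [...regexHits, ...nerHits].sort((a, b) => a.start - b.start);
  const merged: Detection[] = [];
  for (const d of all) {
    const last = merged[merged.length - 1];
    // Drop a hit that overlaps the previous span; the span that starts first wins.
    if (last && d.start < last.end) continue;
    merged.push(d);
  }
  return merged;
}

// Distinct categories across all kept detections — what would populate
// the x-grepture-pii-categories header.
function distinctCategories(detections: Detection[]): string[] {
  return [...new Set(detections.map((d) => d.category))];
}
```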
Redaction strategies
The redaction strategy controls how matches are replaced before the request is forwarded.
placeholder (default, recommended) — replaces matches with stable strings like [EMAIL_REDACTED]. Every email becomes the same token, so two RAG documents about "email delivery problems" still embed to nearly identical vectors. This is the only strategy that preserves k-NN clustering. Use it unless you have a specific reason not to.
hash — replaces matches with a 12-character SHA-256 prefix. Every distinct value gets a distinct token, which breaks similarity-based retrieval — "email user1@x.com" and "email user2@x.com" will end up in different regions of vector space. Useful only if you also store the hash somewhere and need to correlate.
mask — replaces matches with first*last style masks (e.g., j****e@example.com). Partial signal survives but k-NN behavior is unpredictable. Rarely the right choice for embeddings.
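The three strategies can be sketched as a single substitution function. The token formats below are inferred from the examples above and simplified — in particular, this mask variant masks the whole value rather than preserving the email domain as Grepture's does:

```typescript
import { createHash } from "node:crypto";

// Illustrative sketch of the three redaction strategies applied to one match.
function redact(
  match: string,
  category: string,
  strategy: "placeholder" | "hash" | "mask",
): string {
  switch (strategy) {
    case "placeholder":
      // Stable per-category token: every email becomes the same string,
      // which preserves k-NN clustering.
      return `[${category.toUpperCase()}_REDACTED]`;
    case "hash":
      // Distinct token per distinct value: 12-character SHA-256 prefix.
      return createHash("sha256").update(match).digest("hex").slice(0, 12);
    case "mask":
      // Simplified first*last mask over the whole value.
      return match[0] + "*".repeat(Math.max(match.length - 2, 1)) + match[match.length - 1];
  }
}
```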
Modes
redact (default) — detected PII is replaced and the request is forwarded. Best for RAG workloads where you want maximum recall.
block — if any PII is detected, returns 422 pii_detected and does not forward the request. Use it for regulated workloads where no PII should leave your application at all:
curl -X POST https://proxy.grepture.com/v1/embeddings \
-H "Authorization: Bearer grp_live_..." \
-H "x-grepture-on-pii: block" \
-H "Content-Type: application/json" \
-d '{"model":"text-embedding-3-small","input":"call me at 555-123-4567"}'
Response:
{
"error": "pii_detected",
"categories": ["phone"],
"count": 1
}
A row is still written to embedding_logs with blocked: true so the dashboard shows what was caught.
Input shape
input accepts either a single string or an array of strings:
{ "input": "one document to embed" }
{ "input": ["doc 1", "doc 2", "doc 3"] }
Pre-tokenized inputs (number[] or number[][] — what OpenAI calls "token arrays") are rejected with 400 tokenized_input_not_supported. The redaction pipeline operates on text; once a string has been tokenized client-side, there is no surface left to detect PII on.
If your client tokenizes by default (rare), switch it to pass strings.
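The shape check is straightforward to mirror client-side before sending the request. A minimal sketch — the error code matches this page, but the function itself is illustrative, not Grepture's implementation:

```typescript
// Illustrative: validate input shape the way the endpoint does.
type EmbeddingInput = string | string[] | number[] | number[][];

function normalizeInput(input: EmbeddingInput): string[] {
  const asArray = Array.isArray(input) ? input : [input];
  if (asArray.some((item) => typeof item !== "string")) {
    // Pre-tokenized integers carry no text, so there is nothing to redact.
    throw Object.assign(new Error("tokenized_input_not_supported"), { status: 400 });
  }
  return asArray as string[];
}
```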
SDK usage
The @grepture/sdk package exposes a typed wrapper:
import { Grepture } from "@grepture/sdk";
const grepture = new Grepture({
apiKey: process.env.GREPTURE_API_KEY!,
proxyUrl: "https://proxy.grepture.com",
});
const { data, redactions } = await grepture.embeddings.create({
model: "text-embedding-3-small",
input: "email me at john@example.com about order #12345",
});
console.log(redactions); // { count: 1, categories: ["email"] }
// `data` is the standard OpenAI embeddings array — pass it to Pinecone/pgvector/Weaviate as-is.
Options on embeddings.create():
| Option | Default | Notes |
|---|---|---|
| `model` | — | Required. |
| `input` | — | Required. String or string array. |
| `dimensions` | — | Optional. Truncates `text-embedding-3` vectors. |
| `encoding_format` | — | `"float"` or `"base64"`. |
| `user` | — | Forwarded to OpenAI. |
| `onPii` | `"redact"` | `"redact"` or `"block"`. |
| `strategy` | `"placeholder"` | `"placeholder"`, `"hash"`, or `"mask"`. |
| `openaiKey` | — | BYOK passthrough. Sent as `x-grepture-auth-forward`. |
| `traceId` | — | Trace ID for dashboard grouping. |
block mode throws an error with status, code, categories, and count fields attached when PII is detected.
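A hedged sketch of catching that error, based on the fields listed above (`status`, `code`, `categories`, `count`) — the type guard is illustrative, not part of `@grepture/sdk`:

```typescript
// Shape of the error thrown in block mode, per the fields documented above.
interface PiiBlockedError extends Error {
  status: number;
  code: string;
  categories: string[];
  count: number;
}

function isPiiBlocked(err: unknown): err is PiiBlockedError {
  return (
    err instanceof Error &&
    (err as PiiBlockedError).status === 422 &&
    (err as PiiBlockedError).code === "pii_detected"
  );
}

// Usage:
// try {
//   await grepture.embeddings.create({ model, input, onPii: "block" });
// } catch (err) {
//   if (isPiiBlocked(err)) console.warn("blocked:", err.categories);
//   else throw err;
// }
```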
OpenAI SDK drop-in
If you are already using openai, change the base URL and add the Grepture key — no other code changes:
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.GREPTURE_API_KEY!,
baseURL: "https://proxy.grepture.com/v1",
defaultHeaders: {
// Optional: pass your own OpenAI key per-request
"x-grepture-auth-forward": `Bearer ${process.env.OPENAI_API_KEY}`,
},
});
const { data } = await openai.embeddings.create({
model: "text-embedding-3-small",
input: "email me at john@example.com",
});
The x-grepture-redactions and x-grepture-pii-categories headers are still set, but you will need to read them off the raw HTTP response yourself — the typed redactions field is only available via @grepture/sdk.
Errors
| Status | Code | Cause |
|---|---|---|
| 400 | tokenized_input_not_supported | input is a number array. Pass strings. |
| 400 | no_openai_key | No OpenAI key found via header or provider_keys. |
| 400 | — | Missing or invalid model / input. |
| 401 | — | Missing or invalid Grepture API key. |
| 422 | pii_detected | Block mode triggered; request not forwarded. |
| 429 | — | Rate or quota limit hit. |
| 502 | — | Upstream OpenAI unreachable. |
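Of these, only 429 and 502 are worth retrying client-side; the 4xx codes are permanent for a given request. A minimal retry sketch — Grepture does not prescribe this policy, and the backoff values are arbitrary:

```typescript
// Illustrative client-side retry policy for the transient statuses above.
function isRetryable(status: number): boolean {
  return status === 429 || status === 502;
}

async function embedWithRetry(
  call: () => Promise<Response>,
  maxAttempts = 3,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await call();
    if (!isRetryable(res.status) || attempt + 1 >= maxAttempts) return res;
    // Exponential backoff between attempts: 200 ms, 400 ms, ...
    await new Promise((resolve) => setTimeout(resolve, 200 * 2 ** attempt));
  }
}
```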
Observability
Every call writes a row to embedding_logs, viewable in the dashboard at Embeddings. The row records:
- Model, input count, total characters, token usage, duration
- Redaction count, categories caught, source (`regex` / `ai` / `both`)
- Redaction strategy, blocked flag, status code
- Trace ID, BYOK flag, provider key ID
Grepture does not store the input text or the response vectors. This is intentional: the point of the endpoint is to keep PII out of vector storage, and storing it on our side would defeat the feature. The dashboard shows counts and categories only.
Background reading
For the why behind this endpoint, see Your Vector Store Is a Permanent PII Leak.