Never Go Down Because Your Provider Did

Provider keys in Grepture

Grepture now lets you store your LLM provider API keys directly in the dashboard. OpenAI, Anthropic, Gemini, Azure — add as many keys as you need, for as many providers as you use.

Each key is encrypted with AES-256-GCM before storage. Only the last four characters are visible in the dashboard. The proxy decrypts keys in memory at request time — plaintext never hits the cache or logs.

Once your keys are stored, the proxy uses them automatically. No X-Grepture-Auth-Forward header needed. Your SDK setup stays the same — the proxy resolves the right key for the right provider.

Same-provider fallback

Add multiple keys for the same provider. Set one as primary. Link the rest as fallbacks.

When the primary key gets rate-limited (429), the proxy automatically tries the next key. When a key returns an auth error (401) — maybe it was revoked or expired — the proxy moves on. Same for timeouts (408) and server errors (5xx).

This is useful for teams that distribute API usage across multiple keys to stay under rate limits, or teams that keep backup keys for redundancy.

Cross-provider fallback

This is where it gets interesting. A fallback key doesn't have to be the same provider.

Your OpenAI key can fall back to an Anthropic key. If OpenAI is down entirely — every key returning 5xx — the proxy automatically switches to Anthropic. It translates the request from OpenAI's chat completions format to Anthropic's messages format, sends it, and translates the response back.

Your application receives an OpenAI-shaped response. It never knows the request was served by Anthropic.

Cross-provider fallback requires a default_model on the fallback key (so the proxy knows which model to target), and it handles tool calls and standard message formats. Image inputs skip cross-provider translation since formats differ too much to reliably convert.

What triggers fallback

Not every error should trigger a retry. Grepture is selective:

401 — Key revoked or invalid. Another key might work.
408 — Upstream timeout. Worth retrying.
429 — Rate limited. A different key has its own quota.
5xx — Server error. The provider is having issues.

These trigger fallback. But:

400 — Bad request. Your payload is malformed. Retrying with a different key won't fix it.
403 — Content policy violation. Key-specific, won't help to switch.
404 — Model or endpoint not found.

If the error is your bug, the proxy tells you immediately instead of wasting time on retries.

Streaming caveat

One important detail: once a streaming response has started sending data to your application (headers sent, chunks flowing), the proxy can't retry. The connection is committed.

Fallback works for errors that happen before streaming starts — the upstream returns a buffered error response, and the proxy retries transparently. This covers the vast majority of cases: auth errors, rate limits, and server errors all return before streaming begins.

Full trace visibility

Every fallback attempt shows up in the dashboard. You see the full chain: which key was tried, what error came back, which provider ultimately served the request.

This means you can answer questions like: "How often does our OpenAI primary key get rate-limited?" and "What percentage of requests are served by our Anthropic fallback?"

The provider_key_id is logged on every request, so you get per-key analytics out of the box.

Works with zero-data mode

Provider fallback is fully compatible with zero-data mode. The proxy reads request content in memory to translate between provider formats, but nothing is written to disk. You get the same fallback resilience with the same privacy guarantees.

Getting started

Go to Settings > API > Provider Keys in the Grepture dashboard
Add your provider keys (OpenAI, Anthropic, Gemini, Azure)
Set fallback chains — link keys in the order you want them tried
That's it. The proxy handles the rest.

Your existing SDK integration doesn't change. The proxy resolves keys and manages fallback transparently.

Check the Routing product page for more details, or read the configuration docs for the full setup guide.