Prompt Management

Store, version, and resolve prompt templates server-side. Handlebars templating, draft/publish workflow, and header-based or SDK-based resolution.

Overview

Prompt management lets you store prompt templates in Grepture and resolve them at request time. Instead of hardcoding prompts in your application, you define them in the dashboard with Handlebars-style variables, version them with a draft/publish workflow, and resolve them either proxy-side (via headers) or SDK-side (via grepture.prompt.use()).

  • Versioning — edit a draft freely, publish immutable versions (v1, v2, v3...), activate any version as "live", roll back anytime
  • Templating — {{variable}} interpolation, {{#if}} conditionals, {{#each}} loops
  • Resolution — proxy replaces request messages with the resolved template before forwarding to the LLM
  • Rule integration — prompts flow through your rule pipeline by default. Set skip_rules to bypass it for system-only prompts.

Creating a prompt

In the dashboard, go to Prompts and click New Prompt. Each prompt has:

  • name — Display name (e.g., "Support Reply")
  • slug — URL-safe identifier used in API calls (e.g., support-reply)
  • skip_rules — When enabled, the prompt bypasses the guardrail rule pipeline

After creating, you land in the editor where you define messages and variables.

Messages and variables

Each prompt contains an array of messages (system, user, assistant) with Handlebars-style templates:

You are a {{tone}} support agent for {{company}}.

{{#if context}}
Here is the relevant context:
{{context}}
{{/if}}

Please respond to the following issue:
{{issue}}

Define variables in the Variables panel to document what your template expects:

  • name — Variable name (matches {{name}} in templates)
  • type — string, number, or boolean
  • default — Optional fallback value

Versioning workflow

  1. Draft — edit freely, test with the test panel, save as many times as you want
  2. Publish — snapshots the current draft into an immutable version (v1, v2, v3...)
  3. Activate — set which published version is "live" (resolved when no version is specified)
  4. Rollback — activate any older version to roll back instantly

The draft is never served to production traffic unless explicitly requested with @draft.
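
The @draft / @v3 suffix convention can be parsed mechanically. A minimal sketch — the { slug, version } shape here is illustrative, not the SDK's own type:

```typescript
// Sketch: parse "slug", "slug@draft", or "slug@v3" into its parts.
type PromptRef = { slug: string; version?: number | "draft" };

function parsePromptRef(ref: string): PromptRef {
  const at = ref.lastIndexOf("@");
  if (at === -1) return { slug: ref }; // no suffix: the active version
  const slug = ref.slice(0, at);
  const suffix = ref.slice(at + 1);
  if (suffix === "draft") return { slug, version: "draft" };
  const match = /^v(\d+)$/.exec(suffix);
  if (match) return { slug, version: Number(match[1]) };
  throw new Error(`Unrecognized version suffix: ${suffix}`);
}

// parsePromptRef("support-reply")        → { slug: "support-reply" }
// parsePromptRef("support-reply@draft")  → { slug: "support-reply", version: "draft" }
// parsePromptRef("support-reply@v3")     → { slug: "support-reply", version: 3 }
```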

Using prompts via headers

Works with any OpenAI-compatible SDK. Set two headers on your request:

const res = await openai.chat.completions.create(
  { model: "gpt-4o", messages: [] }, // messages here is ignored — replaced by the resolved template
  {
    headers: {
      "X-Grepture-Prompt": "support-reply",
      "X-Grepture-Vars": JSON.stringify({
        issue: ticket.text,
        tone: "friendly",
        company: "Acme",
      }),
    },
  },
);
  • X-Grepture-Prompt — Prompt slug. Append @draft or @v3 to pin a specific version.
  • X-Grepture-Vars — JSON object of template variables.

The proxy resolves the template and replaces the request's messages array before forwarding. The messages array you pass in the request body is ignored.
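
Because resolution is header-based, clients that don't use an OpenAI SDK at all can assemble the two headers by hand. A minimal sketch — the header names are the ones documented above; the helper itself is illustrative, not part of the SDK:

```typescript
// Sketch: build the two resolution headers for a client without the SDK.
// Pinning a version reuses the @draft / @v3 suffix convention.
function buildPromptHeaders(
  slug: string,
  variables: Record<string, string | number | boolean>,
  version?: number | "draft",
): Record<string, string> {
  const ref =
    version === undefined ? slug :
    version === "draft" ? `${slug}@draft` : `${slug}@v${version}`;
  return {
    "X-Grepture-Prompt": ref,
    "X-Grepture-Vars": JSON.stringify(variables),
  };
}
```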

Using prompts via SDK

The SDK exposes a grepture.prompt namespace with methods for every level of control.

prompt.use() — server-side resolution

Pass grepture.prompt.use() directly as the messages field. The proxy resolves the template server-side — zero extra roundtrips:

import OpenAI from "openai";
import { Grepture } from "@grepture/sdk";

const grepture = new Grepture({
  apiKey: process.env.GREPTURE_API_KEY!,
  proxyUrl: "https://proxy.grepture.com",
});

const openai = new OpenAI({
  ...grepture.clientOptions({
    apiKey: process.env.OPENAI_API_KEY!,
    baseURL: "https://api.openai.com/v1",
  }),
});

const res = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: grepture.prompt.use("support-reply", {
    variables: { issue: ticket.text, tone: "friendly" },
    // version: "draft",  // or 3, or omit for active version
  }),
});

prompt.assemble() — client-side resolution

Fetches the template and resolves it locally. Useful when you want to inspect or modify messages before sending:

const { messages } = await grepture.prompt.assemble("support-reply", {
  variables: { issue: ticket.text, tone: "friendly" },
});

// Append extra context before sending
messages.push({ role: "user", content: additionalContext });

const res = await openai.chat.completions.create({
  model: "gpt-4o",
  messages,
});

prompt.get() + prompt.resolve() — advanced control

Fetch the raw template once, resolve it multiple times with different variables:

const template = await grepture.prompt.get("support-reply");

const resolved1 = grepture.prompt.resolve(template.messages, vars1);
const resolved2 = grepture.prompt.resolve(template.messages, vars2);
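
Since get() fetches the raw template once, it pairs naturally with a small cache when many requests share a slug. A sketch with an injected fetcher so it fits any client; the Template shape is illustrative:

```typescript
// Sketch: cache fetched templates by slug so repeated resolutions
// don't refetch. Caching the promise also deduplicates concurrent fetches.
type Template = { messages: { role: string; content: string }[] };

function makeTemplateCache(fetchTemplate: (slug: string) => Promise<Template>) {
  const cache = new Map<string, Promise<Template>>();
  return (slug: string): Promise<Template> => {
    let hit = cache.get(slug);
    if (!hit) {
      hit = fetchTemplate(slug);
      cache.set(slug, hit);
    }
    return hit;
  };
}
```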

prompt.list() — discover prompts

const prompts = await grepture.prompt.list();
for (const p of prompts) {
  console.log(`${p.slug} (v${p.active_version})`);
}

SDK method reference

  • prompt.use(slug, opts?) — returns PromptMessages; no network call. Marker array for server-side resolution via clientOptions().
  • prompt.assemble(slug, opts?) — returns Promise<AssembledPrompt>; one network call. Fetches and resolves client-side; returns { messages, metadata }.
  • prompt.get(slug, opts?) — returns Promise<PromptTemplate>; one network call. Fetches the raw template with {{handlebars}} intact.
  • prompt.resolve(messages, vars) — returns PromptMessage[]; no network call. Pure function — resolves template messages locally.
  • prompt.list() — returns Promise<PromptListItem[]>; one network call. Lists all prompts for the team.

All methods accept an optional version option (number | "draft"). Omit it for the active (live) version.

promptHeaders() (low-level)

If you're not using clientOptions(), you can use the promptHeaders() helper directly to set headers on any request:

import { promptHeaders } from "@grepture/sdk";

const res = await openai.chat.completions.create(
  { model: "gpt-4o", messages: [] }, // messages here is ignored — replaced by the resolved template
  {
    headers: promptHeaders({
      slug: "support-reply",
      variables: { issue: ticket.text, tone: "friendly" },
    }),
  },
);

Fetching prompt templates

If you want to resolve templates client-side, fetch the raw template from the proxy API:

GET /v1/prompts/support-reply
GET /v1/prompts/support-reply@draft
GET /v1/prompts/support-reply@v3

Returns the template messages, variables schema, and metadata. Requires a valid API key in the Authorization header.
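
A plain fetch() client can build these requests in a few lines. A sketch — the proxy base URL matches the SDK example above, but the Bearer scheme and the helper's shape are assumptions, not documented API:

```typescript
// Sketch: build the raw-template request for a plain fetch() client.
// The path shape (slug, slug@draft, slug@vN) follows the examples above.
function templateRequest(slug: string, apiKey: string, version?: number | "draft") {
  const ref =
    version === undefined ? slug :
    version === "draft" ? `${slug}@draft` : `${slug}@v${version}`;
  return {
    url: `https://proxy.grepture.com/v1/prompts/${ref}`,
    headers: { Authorization: `Bearer ${apiKey}` }, // Bearer scheme assumed
  };
}

// const { url, headers } = templateRequest("support-reply", apiKey, 3);
// const template = await (await fetch(url, { headers })).json();
```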

Template syntax

Variable interpolation

Hello {{name}}, your order {{order_id}} is ready.

Missing variables resolve to an empty string.
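
The empty-string fallback can be sketched as a pure function (illustrative only — the proxy's actual resolver may differ in edge cases):

```typescript
// Sketch: {{variable}} interpolation where a missing variable
// resolves to an empty string, per the rule above.
function interpolate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, name) => vars[name] ?? "");
}

// interpolate("Hello {{name}}!", { name: "Ada" })  → "Hello Ada!"
// interpolate("order {{order_id}}", {})            → "order "
```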

Conditionals

{{#if premium}}
You are a premium customer and qualify for priority support.
{{else}}
Standard support response.
{{/if}}

Values are falsy if empty, "false", or "0".
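
The same truthiness rule as a sketch (illustrative; note that a missing variable resolves to an empty string, so it is also falsy):

```typescript
// Sketch: the documented {{#if}} truthiness rule — empty strings,
// "false", and "0" are falsy; everything else is truthy.
function isTruthy(value: string | undefined): boolean {
  return value !== undefined && value !== "" && value !== "false" && value !== "0";
}
```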

Loops

{{#each items}}
- {{this}}
{{/each}}

Pass arrays as JSON strings in the variables object: { items: '["item1", "item2"]' }.
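
Loop expansion over a JSON-string array can be sketched as follows (illustrative; real templates combine this with interpolation and conditionals):

```typescript
// Sketch: expand a {{#each}} body once per item, where the array
// arrives as a JSON string per the note above.
function expandEach(body: string, jsonArray: string): string {
  const items: unknown[] = JSON.parse(jsonArray);
  return items
    .map((item) => body.replace(/\{\{\s*this\s*\}\}/g, String(item)))
    .join("");
}

// expandEach("- {{this}}\n", '["item1", "item2"]')  → "- item1\n- item2\n"
```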

Testing

The Test Panel in the prompt editor lets you fill in variable values and see the resolved output without making an API call. Use it to verify your templates before publishing.

Rule pipeline integration

By default, resolved prompts flow through your guardrail rules like any other request. This means PII redaction, injection detection, and other rules apply to the final resolved messages.

Set Skip Rules on a prompt to bypass the pipeline entirely. Use this for system-only prompts where you control the content and don't want rules interfering.

A/B testing (experiments)

Run experiments to compare prompt versions using real production traffic. Split traffic between two or more published versions, measure quality with eval scores, and activate the winner — all without changing application code.

How it works

  1. Publish 2+ versions of a prompt (e.g., v3 and v4)
  2. Start an experiment from the prompt editor — select versions and set traffic weights (e.g., 50/50, 80/20)
  3. Traffic is split automatically — when the proxy resolves your prompt slug, it picks a version based on the weights. Each request gets an independent roll.
  4. Eval scores per version — Grepture auto-creates a Relevance evaluator for the prompt when the experiment starts. Scores appear per-version in the experiment results panel.
  5. Pick a winner — once you have enough data, end the experiment and activate the winning version with one click
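
The independent per-request roll in step 3 amounts to weighted random selection. A sketch — the Variant shape is illustrative, and the rng parameter is injected only for testability:

```typescript
// Sketch: pick a version by weight, where weights sum to 100 and
// each request rolls independently.
type Variant = { version: number; weight: number };

function pickVariant(variants: Variant[], rng: () => number = Math.random): number {
  let roll = rng() * 100; // uniform roll over the 0–100 weight space
  for (const v of variants) {
    roll -= v.weight;
    if (roll < 0) return v.version;
  }
  return variants[variants.length - 1].version; // guard against float drift
}
```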

Starting an experiment

In the prompt editor sidebar, click Start Experiment in the A/B Test card. Select which published versions to compare, set traffic weights (must sum to 100%), and start.

While an experiment is running:

  • All traffic using X-Grepture-Prompt: your-slug (without an explicit version) is randomly split between the variants
  • Traffic using an explicit version (your-slug@3) bypasses the experiment
  • The experiment results panel auto-refreshes every 30 seconds with per-version metrics: request count, avg latency, avg cost, and eval scores

Reading results

Each variant shows:

  • Requests — Number of requests routed to this version
  • Avg latency — Average response time in milliseconds
  • Avg cost — Average estimated cost per request
  • Eval scores — Average score per evaluator (e.g., Relevance: 0.87)

The leading version (highest eval score) gets a trophy indicator and an Activate & End button.

Ending an experiment

Click End Experiment to stop traffic splitting. Optionally activate the winner in the same action. After ending, all traffic returns to the prompt's active_version.

Edit as Draft

Need to iterate on a variant? Click Edit as Draft when viewing any published version to copy its content into the draft. Edit, publish as a new version, and add it to a new experiment.