How to Monitor and Log All LLM API Calls in One Place

Get unified logging across OpenAI, Anthropic, Google, and Azure. See every request, response, token count, and latency — with a single proxy. No custom logging code.

The problem: scattered providers, scattered logs

You're calling OpenAI for chat, Anthropic for code review, Google for embeddings, and Azure for a fine-tuned model. Each provider has its own logging story — and most of them don't have one at all.

// Four providers. Zero unified view of what's happening.
await openai.chat.completions.create({ model: "gpt-4o", messages });
await anthropic.messages.create({ model: "claude-sonnet-4-5", messages });
await google.generateContent({ model: "gemini-2.0-flash", contents });
await azure.chat.completions.create({ model: "gpt-4o", messages });

When a prompt produces a bad response, you can't inspect what was sent. When latency spikes, you can't tell which model is slow. When a customer reports a bug, you're guessing — because the actual request and response aren't logged anywhere you can search.

Provider dashboards show aggregate metrics. They don't show you the prompt that caused the hallucination at 3:47 PM yesterday.

The solution: unified logging with Grepture

Grepture is an AI gateway that sits between your application and every LLM provider. Every request flowing through the proxy is automatically logged — request body, response body, token counts, latency, HTTP status, model, and which detection rules matched.

No custom logging middleware. No console.log(JSON.stringify(messages)). Route your traffic through the proxy and every call is captured in a single, searchable traffic log.

Setup in 3 minutes

1. Install the SDK

npm install @grepture/sdk

2. Get your API key

Sign up at grepture.com/en/pricing — the free plan includes 1,000 requests/month. Copy your API key from the dashboard.

3. Route your AI traffic through the proxy

OpenAI

import OpenAI from "openai";
import { Grepture } from "@grepture/sdk";

const grepture = new Grepture({
  apiKey: process.env.GREPTURE_API_KEY!,
  proxyUrl: "https://proxy.grepture.com",
});

const openai = new OpenAI({
  ...grepture.clientOptions({
    apiKey: process.env.OPENAI_API_KEY!,
    baseURL: "https://api.openai.com/v1",
  }),
});

// Every request is now logged automatically
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize this document..." }],
});

Anthropic

// Anthropic exposes an OpenAI-compatible endpoint, so the same client works
const anthropic = new OpenAI({
  ...grepture.clientOptions({
    apiKey: process.env.ANTHROPIC_API_KEY!,
    baseURL: "https://api.anthropic.com/v1",
  }),
});

Google Gemini

// Gemini via Google's OpenAI-compatible endpoint
const gemini = new OpenAI({
  ...grepture.clientOptions({
    apiKey: process.env.GEMINI_API_KEY!,
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai",
  }),
});

Azure OpenAI

// Replace your-resource and your-deployment with your Azure values
const azure = new OpenAI({
  ...grepture.clientOptions({
    apiKey: process.env.AZURE_OPENAI_API_KEY!,
    baseURL: "https://your-resource.openai.azure.com/openai/deployments/your-deployment",
  }),
});

Any HTTP API

For providers without an OpenAI-compatible SDK, use grepture.fetch():

const response = await grepture.fetch("https://api.example.com/v1/generate", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.PROVIDER_API_KEY}`,
  },
  body: JSON.stringify({ prompt: "..." }),
});

What gets logged

Every request through the proxy captures:

  • Request and response bodies — the full prompt and completion, inspectable in the dashboard
  • Token counts — input tokens, output tokens, and total for every call
  • Latency — round-trip time from your app to the provider and back
  • HTTP status — success, rate limit, auth failure, timeout
  • Model — which model handled the request
  • Detection rules matched — which Grepture rules fired (PII, secrets, prompt injection)
  • Request ID — unique identifier for every call, available in code via requestId on the response
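
Taken together, a single traffic-log entry can be pictured roughly like this. The field names below are illustrative assumptions for this article, not the SDK's actual types:

```typescript
// Illustrative shape of one traffic-log entry.
// Field names are assumptions, not Grepture's published schema.
interface TrafficLogEntry {
  requestId: string;      // unique per call, e.g. "req_abc123"
  model: string;          // which model handled the request
  status: number;         // HTTP status from the provider
  latencyMs: number;      // round-trip time, app -> provider -> app
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  rulesApplied: string[]; // detection rules that fired
  requestBody: unknown;   // full prompt; omitted in zero-data mode
  responseBody: unknown;  // full completion; omitted in zero-data mode
}

// A sample entry, with made-up values
const example: TrafficLogEntry = {
  requestId: "req_abc123",
  model: "gpt-4o",
  status: 200,
  latencyMs: 842,
  inputTokens: 120,
  outputTokens: 56,
  totalTokens: 176,
  rulesApplied: ["pii-email"],
  requestBody: { messages: [{ role: "user", content: "..." }] },
  responseBody: { choices: [] },
};
```

Everything in this shape is filterable in the dashboard; the request ID is also the handle you'd quote in a support ticket or bug report.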

You can access detection metadata programmatically too:

const response = await grepture.fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
  body: JSON.stringify({ model: "gpt-4o", messages }),
});

console.log(response.requestId);    // "req_abc123..."
console.log(response.rulesApplied); // ["pii-email", "pii-phone"]

Using the traffic log

The dashboard's Traffic Log page is where you'll spend most of your time:

  • Filterable table — search by status code, HTTP method, model, URL, or time window
  • Request detail view — click any row to see the full request and response, including headers, body, token counts, latency, and which rules matched
  • 30-day traffic chart — spot trends, spikes, and anomalies at a glance

For organizations that need logging without storing prompt content, zero-data mode (Business+) captures operational metadata — status, tokens, latency, model — without persisting request or response bodies.

Conversation tracing

AI agents and multi-turn conversations make dozens of LLM calls per user interaction. Without grouping, they're just noise in a flat log. Use trace IDs to link related requests into a single timeline.

Set a trace ID at construction

const grepture = new Grepture({
  apiKey: process.env.GREPTURE_API_KEY!,
  proxyUrl: "https://proxy.grepture.com",
  traceId: `session-${crypto.randomUUID().slice(0, 12)}`,
});

const openai = new OpenAI({
  ...grepture.clientOptions({
    apiKey: process.env.OPENAI_API_KEY!,
    baseURL: "https://api.openai.com/v1",
  }),
});

// Both calls are grouped under the same trace
await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Plan the migration steps." }],
});

await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Execute step 1..." }],
});

Change trace mid-session

When a new conversation starts or an agent begins a separate run, switch the trace:

// New user session — new trace
grepture.setTraceId(`session-${crypto.randomUUID().slice(0, 12)}`);

// Stop tracing
grepture.setTraceId(undefined);
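
If you set trace IDs in more than one place, a small helper keeps the format consistent. The `session-` prefix mirrors the examples above; the helper itself is ours, not part of the Grepture SDK:

```typescript
import { randomUUID } from "node:crypto";

// Build the short "session-xxxxxxxx-xxx" IDs used in the examples above.
// Illustrative helper — not part of the Grepture SDK.
function newTraceId(prefix = "session"): string {
  return `${prefix}-${randomUUID().slice(0, 12)}`;
}

// e.g. grepture.setTraceId(newTraceId()) when a conversation starts,
// or newTraceId("agent") to distinguish agent runs from user sessions
```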

In the dashboard's Traces tab, you'll see all requests grouped by trace with a combined timeline, total token count, and aggregate latency. This turns a wall of individual requests into a readable conversation history.

Framework integration

LangChain

Pass the same client options to LangChain's ChatOpenAI so its underlying OpenAI client goes through the proxy:

import { ChatOpenAI } from "@langchain/openai";
import { Grepture } from "@grepture/sdk";

const grepture = new Grepture({
  apiKey: process.env.GREPTURE_API_KEY!,
  proxyUrl: "https://proxy.grepture.com",
});

const model = new ChatOpenAI({
  modelName: "gpt-4o",
  ...grepture.clientOptions({
    apiKey: process.env.OPENAI_API_KEY!,
    baseURL: "https://api.openai.com/v1",
  }),
});

Every LangChain call — chains, agents, tools — now flows through the proxy and appears in your traffic log.

Vercel AI SDK

import { createOpenAI } from "@ai-sdk/openai";
import { Grepture } from "@grepture/sdk";

const grepture = new Grepture({
  apiKey: process.env.GREPTURE_API_KEY!,
  proxyUrl: "https://proxy.grepture.com",
});

const openai = createOpenAI({
  ...grepture.clientOptions({
    apiKey: process.env.OPENAI_API_KEY!,
    baseURL: "https://api.openai.com/v1",
  }),
});

Next steps