Claude API Structured Output: Three Patterns for Guaranteed JSON

March 25, 2026 · 9 min read · claude-api, structured-output, json, tool-use
If you come from the OpenAI SDK, you are used to response_format: { type: "json_object" } or strict JSON schema mode. You pass a schema, OpenAI enforces it at the decoder level, you get parseable JSON or an error. Simple.

Claude does not have that. There is no response_format flag, no strict schema decoder, no JSON mode toggle. If you ask Claude nicely for JSON in the prompt, it will usually comply. “Usually” is not a word I want in production. I run ten AI agents as cron scripts on a Debian VPS. Every one of them parses Claude output into typed objects downstream. One unescaped quote in a string field will take down the pipeline at 06:30 while I am asleep.

The fix is not one technique but three, ordered by reliability. Use tool use when you need structural guarantees. Use assistant prefill when the output is shallow and you want to save tokens. Use prompt-only when you are prototyping and do not care if it breaks. This post walks through all three with real TypeScript code and the exact tradeoffs I see in production.

Why Claude’s JSON story is different

Anthropic’s position is that tool use already solves structured output, so a separate JSON mode would be redundant. They are mostly right. A tool’s input schema is a JSON schema. When you force Claude to call a specific tool, the model’s output is constrained to match that schema. You get the same guarantees as OpenAI’s strict mode, just through a different door.

The catch is that tool use carries schema overhead on every call (roughly 20 to 40 tokens depending on schema size) and returns the output nested inside a tool_use content block rather than as the top-level message. For shallow extraction, that overhead feels excessive. That is where prefill comes in.

Prefill exploits a simple fact: Claude’s API lets you start the assistant’s response for it. If the last message in the list has role: "assistant" and content "{", Claude must continue from that open brace. It cannot preface with “Sure, here is the JSON”. The next token has to be valid JSON content. This is the pattern I use in most of my cron scripts because it is cheap, fast, and covers the 90% case.

Prompt-only is the “please return JSON” approach. It works until it does not, and when it breaks, it breaks in ways that are hard to catch with simple validation. I only use it for throwaway scripts.

Pattern 1: Tool use as a schema contract

This is the most reliable pattern and what I reach for when the output shape matters. The trick is to define a tool whose entire purpose is to receive your structured data, then force Claude to call it.

Here is an end-to-end example. I want to extract contact info from a raw email body: sender name, email address, intent (one of a fixed enum), and priority score.

import Anthropic from "@anthropic-ai/sdk";
import { z } from "zod";

const client = new Anthropic();

// 1. Define the schema in zod for runtime validation.
const ContactSchema = z.object({
  name: z.string(),
  email: z.string().email(),
  intent: z.enum(["sales", "support", "spam", "other"]),
  priority: z.number().int().min(1).max(5),
});
type Contact = z.infer<typeof ContactSchema>;

// 2. Define the tool. The input_schema is what Claude fills in.
const extractContactTool = {
  name: "extract_contact",
  description: "Extract contact details from an email body.",
  input_schema: {
    type: "object" as const,
    properties: {
      name: { type: "string", description: "Sender's full name" },
      email: { type: "string", description: "Sender's email address" },
      intent: {
        type: "string",
        enum: ["sales", "support", "spam", "other"],
        description: "Primary reason for the email",
      },
      priority: {
        type: "integer",
        minimum: 1,
        maximum: 5,
        description: "1 = low, 5 = urgent",
      },
    },
    required: ["name", "email", "intent", "priority"],
  },
};

async function extractContact(emailBody: string): Promise<Contact> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 512,
    tools: [extractContactTool],
    tool_choice: { type: "tool", name: "extract_contact" },
    messages: [
      {
        role: "user",
        content: `Extract contact details from this email:\n\n${emailBody}`,
      },
    ],
  });

  // 3. Pull the tool_use block out of the response.
  const toolUse = response.content.find((b) => b.type === "tool_use");
  if (!toolUse || toolUse.type !== "tool_use") {
    throw new Error("Claude did not call the tool");
  }

  // 4. Validate. Tool use is reliable but not infallible for enums.
  return ContactSchema.parse(toolUse.input);
}

Two things make this bulletproof. First, tool_choice: { type: "tool", name: "extract_contact" } forces Claude to call this specific tool. It cannot return prose. Second, the input_schema constrains the output shape at the decoder. Required fields will be present. Types will match.

The zod parse at the end is a belt-and-suspenders check. I have seen Claude occasionally fuzz enum values (returning "SALES" instead of "sales" or synthesizing a value not in the list) when the email content is ambiguous. Validation catches that and lets me retry or fall back.
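When the drift is only case or whitespace, a tiny normalizer in front of the zod parse can recover the value instead of failing the whole extraction. This is a sketch of mine, not an SDK feature; `normalizeEnum` is a made-up helper name, and it assumes your canonical enum members are lowercase:

```typescript
// Hypothetical helper: map case- or whitespace-drifted enum values back
// to a canonical member before schema validation. Returns null when
// nothing matches, which is the signal to retry or fall back.
function normalizeEnum<T extends string>(
  value: unknown,
  allowed: readonly T[]
): T | null {
  if (typeof value !== "string") return null;
  const needle = value.trim().toLowerCase();
  return allowed.find((member) => member.toLowerCase() === needle) ?? null;
}

const INTENTS = ["sales", "support", "spam", "other"] as const;
```

Run it over `toolUse.input.intent` before `ContactSchema.parse`: "SALES" comes back as "sales", while a synthesized value like "billing" comes back as null and triggers the retry path.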

For readers coming from the OpenAI ecosystem, the migration pattern is straightforward. If you want the full mapping, I wrote it up in migrating from OpenAI structured output to Claude and a deeper look at Claude tool use patterns.

Pattern 2: Assistant prefill

When the output is a flat object with two or three fields, tool use feels heavy. Prefill gives you 90% of the reliability at 1 token of overhead instead of 20 to 40.

async function extractContactPrefill(emailBody: string): Promise<Contact> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 256,
    messages: [
      {
        role: "user",
        content: `Extract contact details from this email as JSON with keys name, email, intent (one of: sales, support, spam, other), priority (1-5 integer).\n\nEmail:\n${emailBody}\n\nReturn ONLY the JSON object.`,
      },
      {
        role: "assistant",
        content: "{",
      },
    ],
  });

  const text = response.content[0];
  if (text.type !== "text") throw new Error("Unexpected content type");

  // Claude's output starts AFTER the prefill, so we prepend the "{".
  const raw = "{" + text.text;

  // Sometimes Claude closes with a trailing explanation. Strip it.
  const jsonEnd = raw.lastIndexOf("}");
  const cleaned = raw.slice(0, jsonEnd + 1);

  return ContactSchema.parse(JSON.parse(cleaned));
}

With the prefill "{" in place, the next token Claude generates has to continue from inside an open object. It cannot write "Here is the JSON:" first. It must continue valid JSON.

This is the pattern I use in production for my morning briefing and agenda follow-up cron scripts. The script calls claude -p with a JSON instruction and prefill, pipes the output through jq, and posts the result to Telegram. Prefill plus post-parse validation has held up reliably across hundreds of daily runs. When it does fail, it fails at the JSON.parse step, which I catch and log.

Prefill’s weak spot is nested structures. For a flat object, Claude will close the brace and stop. For deeply nested schemas with optional fields, it sometimes writes trailing commas or omits closing brackets under pressure. If your schema has more than one level of nesting, move to tool use.
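One cheap guard against that failure mode is a bracket-balance scan before `JSON.parse`: if the braces and brackets outside string literals do not balance, treat the output as truncated and escalate to the tool-use pattern instead of parsing. This `looksBalanced` helper is a sketch of mine, not a library function:

```typescript
// Sketch: scan for balanced {} and [] outside string literals.
// A non-zero final depth, a premature close, or a dangling open
// string all mean truncated/over-closed JSON, so skip JSON.parse
// and fall back to the tool-use pattern instead.
function looksBalanced(raw: string): boolean {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (const ch of raw) {
    if (escaped) { escaped = false; continue; }
    if (ch === "\\") { if (inString) escaped = true; continue; }
    if (ch === '"') { inString = !inString; continue; }
    if (inString) continue;
    if (ch === "{" || ch === "[") depth++;
    if (ch === "}" || ch === "]") depth--;
    if (depth < 0) return false;
  }
  return depth === 0 && !inString;
}
```

It will not catch everything (trailing commas still fail at parse time), but it turns the most common truncation failures into a clean fallback rather than an exception deep in the pipeline.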

Pattern 3: Prompt-only (and why not to)

This is the naive approach. Ask for JSON in the prompt, hope for the best.

// DO NOT use this in production.
async function extractContactBad(emailBody: string): Promise<Contact> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 512,
    messages: [
      {
        role: "user",
        content: `Return a JSON object with name, email, intent, priority from this email:\n${emailBody}`,
      },
    ],
  });

  const text = (response.content[0] as any).text;
  return ContactSchema.parse(JSON.parse(text));
}

This works maybe 85% of the time. The other 15% you get:

  • "Sure, here is the extracted contact:\n\n{...}" (preamble breaks parse)
  • Markdown code fences: ```json\n{...}\n``` (fences break parse)
  • Trailing explanation: {...}\n\nNote that the priority is 5 because...
  • Single quotes instead of double quotes
  • Unescaped newlines inside string values

You can build a regex pipeline to rescue most of these, but by the time you have, you have reinvented a worse version of prefill. Skip that step. Use Pattern 1 or Pattern 2.
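For scale, here is roughly what that rescue pipeline ends up as. `extractJsonLoose` is a hypothetical helper, shown only to make the point; it still loses to single quotes, unescaped newlines, and braces inside prose:

```typescript
// Hypothetical rescue: strip code fences and preamble, then hope the
// first "{" through the last "}" is the object. Fails on single
// quotes, unescaped newlines, and braces inside surrounding prose.
function extractJsonLoose(raw: string): string | null {
  const withoutFences = raw.replace(/`{3}(?:json)?/g, "");
  const start = withoutFences.indexOf("{");
  const end = withoutFences.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  return withoutFences.slice(start, end + 1);
}
```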

Validating the output

No matter which pattern you pick, validate the parsed object before it touches the rest of your pipeline. Tool use gets you structural guarantees. It does not save you from semantic errors like Claude hallucinating an email address that is syntactically valid but fictional.

My production pattern is a two-step fallback:

async function extractWithFallback(emailBody: string): Promise<Contact | null> {
  try {
    return await extractContact(emailBody);
  } catch (err) {
    console.error("Tool use failed, retrying with prefill", err);
    try {
      return await extractContactPrefill(emailBody);
    } catch (err2) {
      console.error("Both patterns failed, skipping", err2);
      return null;
    }
  }
}

Retry once with a different pattern, then give up and log. Crashing the whole pipeline over a single bad email is worse than dropping one record.

For read-heavy workloads where the same system prompt or tool schema repeats across calls, you can layer this with Claude’s prompt caching to shave the schema tokens off most calls. Schema overhead goes from 20 to 40 tokens per call to roughly 2 cached tokens once the cache is warm.
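Concretely, the cache breakpoint goes on the tool definition itself via `cache_control`. A sketch of the request shape (the minimal `input_schema` here is a stand-in for the full extract_contact schema from Pattern 1):

```typescript
// Prompt caching: mark the last tool with cache_control and Anthropic
// caches the prefix up to that block. The first call pays a cache
// write; later calls within the TTL read the schema from cache.
const cachedTool = {
  name: "extract_contact",
  description: "Extract contact details from an email body.",
  input_schema: {
    type: "object" as const,
    properties: {
      name: { type: "string", description: "Sender's full name" },
    },
    required: ["name"],
  },
  cache_control: { type: "ephemeral" as const },
};

// Then pass it exactly as in Pattern 1:
// await client.messages.create({
//   model: "claude-sonnet-4-6",
//   max_tokens: 512,
//   tools: [cachedTool],
//   tool_choice: { type: "tool", name: "extract_contact" },
//   messages: [...],
// });
```

One caveat: Anthropic enforces a minimum cacheable prefix (1024 tokens on Sonnet-class models), so a small schema only benefits when it sits alongside a long system prompt that gets cached with it.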

Edge cases that trip Claude up

A few patterns need extra care compared to OpenAI.

Enums. Claude respects enum constraints in tool schemas most of the time but will occasionally synthesize values under ambiguity. Always validate enums in zod or pydantic. For critical enums, include the allowed values in the tool description too, not just the enum property. The redundancy helps.

Optional fields. Claude will sometimes include optional fields with null values, sometimes omit them. Your schema needs .optional().nullable() or the equivalent. Decide which you want and be explicit.

Deeply nested objects. More than two levels of nesting and Claude starts hallucinating structure. Flatten where you can. If you need deep nesting, split into multiple tool calls or use extended thinking so the model plans the output before writing it.

Integer vs float. A field typed as integer will sometimes come back as 3.0. Coerce in your validator.
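The last two gotchas can be absorbed in one pre-parse normalization pass. A sketch with a made-up helper, not a library API: drop null fields so missing and null optionals behave the same, and coerce integer-ish strings like "3" or "3.0" back to numbers, while anything else passes through for the schema to reject:

```typescript
// Hypothetical pre-parse cleanup for two common drifts:
// 1. null-valued optionals are dropped, so downstream code only ever
//    sees "field absent" rather than two flavors of empty.
// 2. integer-ish strings ("3", "3.0") are coerced to numbers; genuine
//    floats like "3.5" are left alone so validation still rejects them.
function normalizeRecord(
  input: Record<string, unknown>
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    if (value === null || value === undefined) continue;
    if (typeof value === "string" && /^-?\d+(\.0+)?$/.test(value)) {
      out[key] = parseInt(value, 10);
      continue;
    }
    out[key] = value;
  }
  return out;
}
```

Note that in JavaScript, `JSON.parse` already collapses a literal `3.0` into the number 3, so the coercion mostly matters when the value arrives as a string; in a Python/pydantic pipeline you would do the equivalent with `int(value)` in a validator.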

Which pattern when

  • Use tool use when: the schema is nested, enums matter, the object has more than three fields, or the output feeds a typed downstream system. This is the default for anything in production.
  • Use prefill when: the output is a flat object with two or three fields, you are running the call thousands of times a day and token cost matters, or you want the response as plain text for easier logging. This is what I use in most cron scripts.
  • Use prompt-only when: you are in a Jupyter notebook exploring, or the output is literally one field and you can regex it out. Never in production.

The pattern I reach for first is tool use with a zod validation fallback. It costs a handful of tokens more than prefill, and it buys me sleep. When 06:30 comes and the cron fires, I want the JSON to be JSON.
