
Structured Outputs in Production: Schemas & Retries

Dillip Chowdary
Tech Entrepreneur & Innovator · June 18, 2026 · 7 min read

Bottom Line

Structured outputs move JSON correctness into the API contract, but they do not remove the need for local validation, bounded retries, schema versioning, and observability.

Key Takeaways

  • Use small closed schemas with required fields, enums, and explicit schema versions.
  • Validate locally even when the provider enforces structured output.
  • Retry transport, truncation, and semantic validation failures differently.
  • Log schema version, model, attempt count, and failure class for every call.

Structured outputs make LLM APIs behave more like normal service boundaries: you send input, declare a response contract, and expect machine-readable data back. That is a major improvement over prompt-only JSON, but it is not the whole production story. As of June 18, 2026, the reliable pattern combines schema-first design, provider-side enforcement, local semantic validation, bounded retries, and logs that explain every accepted or rejected object.

  • Use a small schema before adding business rules.
  • Keep validation in your application, not only at the provider boundary.
  • Retry by failure class, not by instinct.
  • Version every schema that downstream code depends on.

1. Design the Contract

Prerequisites

  • A TypeScript service with Node.js and environment-based API credentials.
  • Basic familiarity with JSON Schema and runtime validators.
  • Dependencies: openai, zod, and zod-to-json-schema.
  • Sample records with secrets removed; use the Data Masking Tool before pasting production text into tests.

Bottom Line

Structured output is a contract boundary, not a trust boundary. Keep schemas small, validate locally, and make retries conditional on the exact failure mode.

Start with the smallest object your application can safely consume. Provider docs for structured outputs describe schema-constrained JSON, but they also note that only supported subsets of JSON Schema are valid in strict modes. That means schema design should be conservative.

import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

export const Extraction = z.object({
  schema_version: z.literal('invoice-extraction-v1'),
  vendor: z.string().min(1),
  invoice_id: z.string().min(1),
  currency: z.enum(['USD', 'EUR', 'GBP']),
  total_cents: z.number().int().nonnegative(),
  confidence: z.enum(['low', 'medium', 'high'])
}).strict();

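// No `name` option here: with a name, zod-to-json-schema emits a root-level
// $ref into `definitions`, while strict modes generally expect an object root.
export const extractionJsonSchema = zodToJsonSchema(Extraction, {
  $refStrategy: 'none'
});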

Use this split deliberately:

  • JSON Schema defines the provider-facing shape.
  • Zod validates the returned object inside your service.
  • schema_version lets downstream jobs reject stale contracts safely.
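The third bullet can be sketched as a small gate in the consuming job. This is a minimal illustration, not part of the service above: `SUPPORTED_VERSIONS` and `acceptEvent` are hypothetical names, and the gate only checks the version string before any further parsing happens.

```typescript
// Hypothetical consumer-side gate; the names below are illustrative assumptions.
const SUPPORTED_VERSIONS: ReadonlySet<string> = new Set(['invoice-extraction-v1']);

type GateResult =
  | { accepted: true; version: string }
  | { accepted: false; reason: string };

// Reject any payload whose schema_version is missing or not explicitly supported.
function acceptEvent(payload: unknown): GateResult {
  if (typeof payload !== 'object' || payload === null) {
    return { accepted: false, reason: 'payload is not an object' };
  }
  const version = (payload as Record<string, unknown>)['schema_version'];
  if (typeof version !== 'string') {
    return { accepted: false, reason: 'schema_version is missing' };
  }
  if (!SUPPORTED_VERSIONS.has(version)) {
    return { accepted: false, reason: `unsupported schema_version: ${version}` };
  }
  return { accepted: true, version };
}
```

Keeping the supported set explicit means a new schema version is rejected by default until the consumer opts in.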

2. Call the API

The API call should request structured output explicitly. The example below uses the OpenAI Responses API with a strict schema format. The exact model should be one your account supports for structured outputs; pin it in config so you can roll forward deliberately.

import OpenAI from 'openai';
import { extractionJsonSchema } from './schema';

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function extractInvoice(rawText: string): Promise<string> {
  const response = await client.responses.create({
    model: process.env.OPENAI_MODEL || 'gpt-4o-mini',
    input: [
      {
        role: 'system',
        content: 'Extract invoice fields. Return only data supported by the schema.'
      },
      { role: 'user', content: rawText }
    ],
    text: {
      format: {
        type: 'json_schema',
        name: 'invoice_extraction',
        strict: true,
        schema: extractionJsonSchema
      }
    }
  });

  return response.output_text;
}

Keep the prompt boring. The schema should carry the structure; the instructions should describe task boundaries and what to do with missing or incompatible input. If a field is unknown, prefer an explicit signal, such as setting confidence to low, over inventing a value.

Watch out: Do not paste sensitive production documents into local fixtures. Mask names, emails, account numbers, and tokens before creating regression tests.
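Returning the output text directly means truncated or refused responses only surface later as parse errors. The sketch below shows one way to classify a response before parsing. It assumes a trimmed-down response shape (`ResponseLike`, a hypothetical local type) whose field names are modeled on the Responses API object; check them against the SDK types you actually have installed.

```typescript
// Hypothetical, simplified view of a response object. Field names are modeled on
// the Responses API, but this local type is an assumption, not the SDK's type.
interface ContentPart { type: string; text?: string; refusal?: string }
interface OutputItem { type: string; content?: ContentPart[] }
interface ResponseLike {
  status?: string;
  incomplete_details?: { reason?: string } | null;
  output: OutputItem[];
}

type Inspection =
  | { kind: 'ok'; text: string }
  | { kind: 'incomplete'; reason: string }
  | { kind: 'refusal'; message: string };

// Classify the response before any JSON.parse so each case gets its own failure class.
function inspectResponse(response: ResponseLike): Inspection {
  if (response.status === 'incomplete') {
    const reason = response.incomplete_details?.reason ?? 'unknown';
    return { kind: 'incomplete', reason };
  }
  let text = '';
  for (const item of response.output) {
    for (const part of item.content ?? []) {
      if (part.type === 'refusal') {
        return { kind: 'refusal', message: part.refusal ?? '' };
      }
      if (part.type === 'output_text') text += part.text ?? '';
    }
  }
  return { kind: 'ok', text };
}
```

Only the `ok` case should reach the JSON parser; the other two are better logged and routed as their own failure classes.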

3. Validate and Retry

Provider-side structured output reduces malformed JSON, but your service still needs runtime validation. Validation catches unsupported schema conversions, semantic gaps, provider refusals, truncation, and downstream assumptions that the schema cannot express.

import { Extraction } from './schema';
import { extractInvoice } from './llm';

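type FailureClass = 'transport' | 'rate_limit' | 'parse' | 'validation' | 'unknown';

// Map thrown errors to a failure class so the loop can decide how to proceed.
function classifyError(error: unknown): FailureClass {
  if (error instanceof SyntaxError) return 'parse'; // JSON.parse failed, e.g. truncated output
  const name = error instanceof Error ? error.name : '';
  const message = error instanceof Error ? error.message : String(error);
  if (name === 'AbortError') return 'transport';
  // Heuristic only; prefer typed SDK errors or HTTP status codes when available.
  if (message.toLowerCase().includes('rate')) return 'rate_limit';
  return 'unknown';
}

export async function extractWithRetry(rawText: string, maxAttempts = 3) {
  let lastError: { type: FailureClass; details: unknown } | undefined;
  let attempts = 0;
  let validationFailures = 0;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    attempts = attempt;
    try {
      const text = await extractInvoice(rawText);
      const parsed = Extraction.safeParse(JSON.parse(text));

      if (parsed.success) {
        return { ok: true as const, value: parsed.data, attempts };
      }

      lastError = { type: 'validation', details: parsed.error.issues };
      validationFailures++;
    } catch (error) {
      const details = error instanceof Error ? error.message : String(error);
      lastError = { type: classifyError(error), details };
    }

    // Skip the sleep after the final attempt, and give validation failures
    // one retry at most before handing off to review or fallback.
    if (attempt === maxAttempts || validationFailures > 1) break;
    await new Promise(resolve => setTimeout(resolve, 250 * attempt));
  }

  // Report the attempts actually made, not the configured maximum.
  return { ok: false as const, error: lastError, attempts };
}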

Treat retry policy as part of the contract:

  • Transport failures can usually retry with backoff.
  • Rate limits should respect provider guidance and queue pressure.
  • Validation failures deserve one retry at most before review or fallback.
  • Refusals should not be hidden behind blind retries.
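The bullets above can be encoded as a small policy table so the retry decision is data rather than scattered conditionals. This is a sketch under stated assumptions: the limits in `MAX_RETRIES` and the backoff constants are illustrative values, not provider recommendations, and `shouldRetry` and `backoffMs` are hypothetical helper names.

```typescript
type RetryClass = 'transport' | 'rate_limit' | 'truncation' | 'validation' | 'refusal';

// Illustrative limits (assumptions): maximum retries allowed per failure class.
const MAX_RETRIES: Record<RetryClass, number> = {
  transport: 3,
  rate_limit: 3,
  truncation: 1,
  validation: 1,
  refusal: 0
};

// retriesSoFar counts retries already made for this call (0 after the first failure).
function shouldRetry(failure: RetryClass, retriesSoFar: number): boolean {
  return retriesSoFar < MAX_RETRIES[failure];
}

// Exponential backoff with an upper bound; callers may add jitter on top.
function backoffMs(retriesSoFar: number, baseMs = 250, capMs = 4000): number {
  return Math.min(capMs, baseMs * 2 ** retriesSoFar);
}
```

One caveat: the openai Node SDK retries some connection and rate-limit errors on its own by default (configurable through `maxRetries`), so an outer loop multiplies attempts unless one of the two layers is turned down.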

Log the model, schema version, attempt count, latency, and failure class. Do not log raw prompts unless your retention and privacy rules explicitly allow it.
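A log record along those lines might look like the sketch below. The `CallLog` shape and the `buildCallLog` helper are hypothetical; the point is that the record carries the fields listed above plus the input size, while the raw text itself is left out.

```typescript
// Hypothetical log record shape; field names are assumptions for illustration.
interface CallLog {
  model: string;
  schema_version: string;
  attempts: number;
  latency_ms: number;
  failure_class: string | null; // null when the object was accepted
  input_chars: number;          // size only; the raw prompt is not stored
}

interface CallInfo {
  model: string;
  schemaVersion: string;
  attempts: number;
  startedAtMs: number;
  finishedAtMs: number;
  failureClass: string | null;
  rawText: string;
}

// Build the record from call metadata, deliberately dropping the raw text.
function buildCallLog(info: CallInfo): CallLog {
  return {
    model: info.model,
    schema_version: info.schemaVersion,
    attempts: info.attempts,
    latency_ms: info.finishedAtMs - info.startedAtMs,
    failure_class: info.failureClass,
    input_chars: info.rawText.length
  };
}
```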

Verification and Expected Output

Run a fixture through the wrapper and assert on the typed result, not the raw model text. Your test should prove that the contract accepted the object and that downstream code receives normalized fields.

const fixture = 'Invoice ACME-42 from Contoso. Total USD 19.99.';
const result = await extractWithRetry(fixture);

console.log(result);

Expected output should look like this:

{
  ok: true,
  value: {
    schema_version: 'invoice-extraction-v1',
    vendor: 'Contoso',
    invoice_id: 'ACME-42',
    currency: 'USD',
    total_cents: 1999,
    confidence: 'high'
  },
  attempts: 1
}

For production verification, add these checks to CI:

  • Golden fixtures for normal, missing, ambiguous, and adversarial inputs.
  • Contract tests that fail when schema_version changes without migration notes.
  • Metrics for validation failure rate, retry rate, and average attempts per success.
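The three metrics in the last bullet are simple ratios over per-call records. The sketch below assumes a hypothetical `CallRecord` shape and counts a validation failure only when it is the final outcome of a call; other definitions are equally defensible, so treat the formulas as one reasonable choice.

```typescript
// Hypothetical per-call record; one entry per wrapped call, not per attempt.
interface CallRecord { ok: boolean; attempts: number; failureClass?: string }

interface Summary {
  validationFailureRate: number; // calls that ended in a validation failure / all calls
  retryRate: number;             // calls needing more than one attempt / all calls
  avgAttemptsPerSuccess: number; // mean attempts across successful calls
}

function summarize(records: CallRecord[]): Summary {
  const total = records.length;
  const successes = records.filter(r => r.ok);
  const validationFailures = records.filter(
    r => !r.ok && r.failureClass === 'validation'
  ).length;
  const retried = records.filter(r => r.attempts > 1).length;
  const attemptsOnSuccess = successes.reduce((sum, r) => sum + r.attempts, 0);
  return {
    validationFailureRate: total === 0 ? 0 : validationFailures / total,
    retryRate: total === 0 ? 0 : retried / total,
    avgAttemptsPerSuccess: successes.length === 0 ? 0 : attemptsOnSuccess / successes.length
  };
}
```

For example, four calls where two succeed (after 1 and 3 attempts), one ends in a validation failure after 2 attempts, and one is refused on the first attempt give a validation failure rate of 0.25, a retry rate of 0.5, and 2 average attempts per success.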

Troubleshooting Top 3

1. The API rejects the schema

  • Remove unsupported keywords and deeply nested objects.
  • Set closed object behavior with additionalProperties: false when required by the provider.
  • Prefer simple enums and required fields over complex unions.
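Some of these rejections can be caught before the request is sent. The sketch below is a hypothetical pre-flight lint (`lintStrictSchema` is not a library function) that checks two rules commonly required by strict structured-output modes: every object sets additionalProperties to false, and every property is listed in required. It does not cover unsupported keywords, which vary by provider.

```typescript
type JsonSchema = { [key: string]: unknown };

// Walk object and array nodes, collecting human-readable problems with their paths.
function lintStrictSchema(schema: JsonSchema, path = '$'): string[] {
  const problems: string[] = [];
  if (schema['type'] === 'object') {
    const properties = (schema['properties'] ?? {}) as Record<string, JsonSchema>;
    const required = Array.isArray(schema['required']) ? (schema['required'] as unknown[]) : [];
    if (schema['additionalProperties'] !== false) {
      problems.push(`${path}: additionalProperties must be false`);
    }
    for (const key of Object.keys(properties)) {
      if (required.indexOf(key) === -1) {
        problems.push(`${path}.${key}: not listed in required`);
      }
      const child = properties[key];
      if (child) problems.push(...lintStrictSchema(child, `${path}.${key}`));
    }
  }
  const items = schema['items'];
  if (schema['type'] === 'array' && typeof items === 'object' && items !== null) {
    problems.push(...lintStrictSchema(items as JsonSchema, `${path}[]`));
  }
  return problems;
}
```

Running a lint like this in CI against the generated JSON Schema turns a runtime API rejection into a failed build.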

2. The JSON parses but validation fails

  • Check whether your schema generator changed optional fields or numeric types.
  • Add clearer field descriptions in the provider schema.
  • Split one overloaded extraction task into two smaller calls.

3. Retries increase cost without improving success

  • Log the failure class before retrying.
  • Stop retrying repeated semantic failures with the same prompt.
  • Route low-confidence or incompatible input to a fallback workflow.
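The last bullet amounts to a routing decision on the wrapper's result. The sketch below assumes a result shape similar to what the retry wrapper returns and a hypothetical `routeResult` helper; what the fallback workflow actually does (manual review, a second model, a queue) is left open.

```typescript
type Confidence = 'low' | 'medium' | 'high';

// Assumed result shape, trimmed to the fields the router needs.
type ExtractionResult =
  | { ok: true; value: { confidence: Confidence }; attempts: number }
  | { ok: false; error: { type: string }; attempts: number };

type Route =
  | { route: 'accept' }
  | { route: 'fallback'; reason: string };

// Accept only successful, non-low-confidence results; everything else goes to fallback.
function routeResult(result: ExtractionResult): Route {
  if (!result.ok) {
    return { route: 'fallback', reason: result.error.type };
  }
  if (result.value.confidence === 'low') {
    return { route: 'fallback', reason: 'low_confidence' };
  }
  return { route: 'accept' };
}
```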

What's Next

Once the loop works, move from examples to governance. Store schemas beside the code that consumes them, publish contract changes in release notes, and build dashboards around validation failures. For larger systems, add a replay harness that runs old fixtures against new prompts, models, and schema versions before deployment.
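A replay harness can start very small. The sketch below assumes golden fixtures stored as plain objects and an injected `Extractor` function (both hypothetical names), so the same harness can run against a stub in tests or the real wrapper before a deployment. It reports which fields changed per fixture rather than passing or failing outright.

```typescript
interface Fixture { name: string; input: string; expected: Record<string, unknown> }
type Extractor = (input: string) => Promise<Record<string, unknown>>;

// List the fields whose values differ between the golden object and a new output.
function diffFields(
  expected: Record<string, unknown>,
  actual: Record<string, unknown>
): string[] {
  const keys = Object.keys(expected).concat(Object.keys(actual));
  const unique = keys.filter((key, index) => keys.indexOf(key) === index);
  return unique
    .filter(key => JSON.stringify(expected[key]) !== JSON.stringify(actual[key]))
    .sort();
}

// Run every fixture through the injected extractor and collect per-fixture diffs.
async function replay(
  fixtures: Fixture[],
  extract: Extractor
): Promise<{ name: string; changed: string[] }[]> {
  const report: { name: string; changed: string[] }[] = [];
  for (const fixture of fixtures) {
    const actual = await extract(fixture.input);
    report.push({ name: fixture.name, changed: diffFields(fixture.expected, actual) });
  }
  return report;
}
```

An empty `changed` list for every fixture is the signal that a new prompt, model, or schema version preserved the old behavior.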

The mature production pattern is simple: structured output at the provider, runtime validation in the service, typed events in logs, and explicit fallback paths. That combination turns LLM output from a fragile text blob into an observable API boundary your engineering team can operate.

Frequently Asked Questions

Do structured outputs mean I can skip JSON validation?
No. Provider-side structured output improves schema adherence, but your application should still validate with Zod, Pydantic, or another runtime validator. Local validation catches semantic errors, conversion bugs, and downstream contract drift.
What is the best JSON Schema design for LLM structured outputs?
Use a small closed object with required fields, explicit enums, and a schema version. Avoid deeply nested structures and complex unions unless the provider explicitly supports them in structured output mode.
How many retries should a structured LLM call use?
Start with two or three total attempts and classify the failure before retrying. Transport errors and truncation may be retryable, while refusals and repeated validation failures usually need a prompt, schema, or workflow change.
Should I use function calling or response-format structured outputs?
Use function calling when the model is selecting or invoking tools in your system. Use response-format structured outputs when you need the assistant's final answer to match a JSON contract for parsing, storage, or UI rendering.
