AI Engineering

Structured Outputs in Production: Schema, Retry, Validate

Dillip Chowdary
Tech Entrepreneur & Innovator · June 04, 2026 · 7 min read

Bottom Line

Structured outputs are not a replacement for validation; they are the first contract in a larger production loop. Design small schemas, validate locally, retry only recoverable failures, and log schema versions with every request.

Key Takeaways

  • Use small, closed schemas with required fields and explicit enums.
  • Keep runtime validation even when the provider supports strict structured output.
  • Apply a different retry policy to transport, truncation, and validator failures.
  • Version schemas so downstream services can track contract drift.

Structured outputs make LLM APIs usable in systems that expect contracts, not prose. Instead of asking a model to "return JSON" and hoping your parser survives, you provide a schema, validate the response, and route failures through a retry policy. This tutorial builds a production-ready TypeScript pattern using JSON Schema, OpenAI's structured output mode, Zod validation, bounded retries, and observable error handling.

Prerequisites

  • Node.js project with TypeScript enabled.
  • An LLM API key available as OPENAI_API_KEY.
  • Basic familiarity with JSON Schema and runtime validators.
  • Install dependencies: npm install openai zod zod-to-json-schema.

Bottom Line

Structured output is a contract boundary, not a magic trust boundary. Keep schemas small, validate locally, and make retries conditional on the exact failure mode.

OpenAI's official documentation describes structured outputs as JSON responses that adhere to a supplied schema when configured with strict schema adherence. The same docs note that only a subset of JSON Schema is supported in strict mode, and that refusals or token limits can still prevent a normal object from arriving. That is why production code still needs validation and failure routing.

1. Design the Schema

Start with the smallest useful object. A schema that tries to encode every business rule becomes brittle, expensive to reason about, and harder to migrate. Put structural requirements in the schema, then enforce domain rules in application code.

Use a closed contract

  • Require every field the downstream service needs.
  • Use enums for workflow decisions instead of free-form labels.
  • Set additionalProperties to false so unexpected fields cannot silently appear.
  • Include a schema_version field when responses are stored, replayed, or consumed asynchronously.

import { z } from "zod";

export const TicketTriageSchema = z.object({
  schema_version: z.literal("ticket_triage.v1"),
  category: z.enum(["billing", "bug", "account", "security", "other"]),
  priority: z.enum(["low", "medium", "high", "urgent"]),
  customer_visible_summary: z.string().min(20).max(240),
  needs_human_review: z.boolean(),
  confidence: z.number().min(0).max(1)
}).strict();

export type TicketTriage = z.infer<typeof TicketTriageSchema>;
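
The split between structural rules (in the schema) and domain rules (in application code) can be sketched as follows. This is a minimal illustration, not part of the tutorial's pipeline: the `TriageResult` type mirrors the Zod schema so the block runs on its own, and both rules, along with the `URGENT_CONFIDENCE_FLOOR` value, are hypothetical examples of application-level policy.

```typescript
// Local mirror of the TicketTriage shape so this sketch is self-contained.
type TriageResult = {
  schema_version: "ticket_triage.v1";
  category: "billing" | "bug" | "account" | "security" | "other";
  priority: "low" | "medium" | "high" | "urgent";
  customer_visible_summary: string;
  needs_human_review: boolean;
  confidence: number;
};

// Hypothetical threshold; tune it from your own eval data.
const URGENT_CONFIDENCE_FLOOR = 0.7;

// Returns the names of violated domain rules. An empty list means the object
// is acceptable. Structural validation is assumed to have run already.
function checkDomainRules(result: TriageResult): string[] {
  const violations: string[] = [];
  if (result.category === "security" && !result.needs_human_review) {
    violations.push("security_requires_human_review");
  }
  if (
    result.priority === "urgent" &&
    result.confidence < URGENT_CONFIDENCE_FLOOR &&
    !result.needs_human_review
  ) {
    violations.push("low_confidence_urgent_requires_human_review");
  }
  return violations;
}

const sampleTriage: TriageResult = {
  schema_version: "ticket_triage.v1",
  category: "security",
  priority: "urgent",
  customer_visible_summary: "The customer reports an unrecognized login on their account.",
  needs_human_review: false,
  confidence: 0.55
};

console.log(checkDomainRules(sampleTriage));
```

Keeping rules like these out of the schema means they can change without a schema version bump.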

If prompts or sample tickets contain private data, sanitize fixtures before adding them to tests or bug reports. TechBytes' Data Masking Tool is useful for removing emails, phone numbers, and customer identifiers from structured-output examples.

2. Call the API

The provider schema should come from the same source as your runtime validator. In TypeScript, that usually means authoring a Zod schema and converting it to JSON Schema for the API call. Do not maintain two hand-written schemas unless you also maintain a test that proves they match.

import OpenAI from "openai";
import { zodToJsonSchema } from "zod-to-json-schema";
import { TicketTriageSchema } from "./schema";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const model = process.env.OPENAI_MODEL ?? "gpt-4o-2024-08-06";

export async function triageTicket(input: string) {
  // Derive the provider schema from the Zod schema so both share one source.
  // Passing `name` nests the generated schema under `definitions.ticket_triage`.
  const jsonSchema = zodToJsonSchema(TicketTriageSchema, {
    name: "ticket_triage"
  });

  return client.responses.create({
    model,
    input: [
      {
        role: "system",
        content: "Classify the support ticket. Return only the requested structured object."
      },
      { role: "user", content: input }
    ],
    text: {
      format: {
        type: "json_schema",
        name: "ticket_triage",
        strict: true,
        // Send the unwrapped object schema rather than the $ref wrapper.
        schema: jsonSchema.definitions?.ticket_triage ?? jsonSchema
      }
    }
  });
}

Watch out: Provider schema support is not identical to a full local JSON Schema engine. Keep the schema simple and test the exact generated schema against the API before promoting it.
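
One way to act on this warning is a pre-flight check that runs in CI before the schema ever reaches the provider. The sketch below assumes two strict-mode requirements that generated schemas commonly trip over: every object sets `additionalProperties: false`, and every property is listed in `required`. The `SchemaNode` type and the `findStrictModeIssues` function are illustrative names, and the check does not cover every provider rule.

```typescript
// Minimal structural view of a JSON Schema node (hypothetical helper type).
type SchemaNode = {
  type?: string;
  properties?: Record<string, SchemaNode>;
  required?: string[];
  additionalProperties?: boolean;
  items?: SchemaNode;
};

// Walks the schema and reports objects that are open or have optional fields.
function findStrictModeIssues(node: SchemaNode, path: string = "$"): string[] {
  const issues: string[] = [];
  if (node.type === "object") {
    if (node.additionalProperties !== false) {
      issues.push(`${path}: additionalProperties must be false`);
    }
    const properties = node.properties ?? {};
    const required = node.required ?? [];
    for (const key of Object.keys(properties)) {
      if (required.indexOf(key) === -1) {
        issues.push(`${path}.${key}: not listed in required`);
      }
      const child = properties[key];
      if (child) {
        issues.push(...findStrictModeIssues(child, `${path}.${key}`));
      }
    }
  }
  if (node.items) {
    // Array element schemas are checked under a "[]" path suffix.
    issues.push(...findStrictModeIssues(node.items, `${path}[]`));
  }
  return issues;
}

// Example: `tags` is optional and its item object is open.
const draftSchema: SchemaNode = {
  type: "object",
  properties: {
    category: { type: "string" },
    tags: {
      type: "array",
      items: { type: "object", properties: { label: { type: "string" } } }
    }
  },
  required: ["category"],
  additionalProperties: false
};

console.log(findStrictModeIssues(draftSchema));
```

A check like this complements, but does not replace, a test call against the real API with the exact generated schema.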

3. Validate and Retry

Retries should be boring and specific. A retry for a transient network error is not the same as a retry for a safety refusal, a truncated response, or a local validator failure. Treat each case differently so your service does not amplify cost or hide bad data.

Classify failure modes

  • Transport errors: retry with exponential backoff and jitter.
  • Rate limits: honor provider retry timing when available and cap concurrency.
  • Truncation: retry with a smaller prompt, higher output allowance, or narrower schema.
  • Refusal: do not retry blindly; return a policy-aware application error.
  • Validator failure: retry once with the same schema, then quarantine the sample for review.

import { TicketTriageSchema, type TicketTriage } from "./schema";
import { triageTicket } from "./llm";

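// Returns the JSON text, or throws a distinct error for refusals and truncation.
function extractOutputText(response: any): string {
  // Refusals arrive as a "refusal" content part rather than as JSON text.
  const refused = response.output?.some((item: any) =>
    item.content?.some((part: any) => part.type === "refusal")
  );
  if (refused) throw new Error("refusal");
  // An incomplete response usually means the output limit cut the JSON short.
  if (response.status === "incomplete") throw new Error("truncated_output");
  const text = response.output_text;
  if (typeof text !== "string" || text.length === 0) {
    throw new Error("missing_output_text");
  }
  return text;
}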

export async function triageWithRetry(ticket: string): Promise<TicketTriage> {
  const maxAttempts = 3;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await triageTicket(ticket);
      const parsedJson = JSON.parse(extractOutputText(response));
      const parsed = TicketTriageSchema.safeParse(parsedJson);

      if (!parsed.success) {
        throw new Error(`validator_failed:${parsed.error.issues[0]?.path.join(".")}`);
      }

      return parsed.data;
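    } catch (error) {
      const message = String(error);
      // Refusals are never retried; validator failures get a single retry.
      const giveUp =
        attempt === maxAttempts ||
        message.includes("refusal") ||
        (message.includes("validator_failed") && attempt >= 2);
      if (giveUp) {
        throw error;
      }

      // Exponential backoff with jitter before the next attempt.
      const delayMs = 250 * 2 ** (attempt - 1) + Math.floor(Math.random() * 100);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }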
  }

  throw new Error("unreachable_retry_state");
}
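
The per-failure policy from the list above can also be factored into a standalone classifier, which keeps the retry loop short and makes the policy testable. The following is a sketch under assumptions: the `FailureType` names, the `classifyFailure` and `decideRetry` functions, and the message prefixes (`refusal`, `truncated`, `validator_failed`) are conventions chosen for this example, and HTTP errors are assumed to expose a numeric `status` property.

```typescript
type FailureType = "transport" | "rate_limit" | "truncation" | "refusal" | "validator";

type RetryDecision = { failure: FailureType; retry: boolean; delayMs: number };

// Maps a thrown value to a failure type using hypothetical message prefixes
// and an assumed numeric `status` property for HTTP errors.
function classifyFailure(error: unknown): FailureType {
  const message = error instanceof Error ? error.message : String(error);
  const status = (error as { status?: number } | null)?.status;
  if (status === 429) return "rate_limit";
  if (message.indexOf("refusal") === 0) return "refusal";
  if (message.indexOf("truncated") === 0) return "truncation";
  if (message.indexOf("validator_failed") === 0) return "validator";
  return "transport"; // anything unrecognized is treated as transient
}

// Decides whether to retry and how long to wait. `attempt` is 1-based and
// `retryAfterMs` is the provider's wait hint, when one was sent.
function decideRetry(
  error: unknown,
  attempt: number,
  maxAttempts: number,
  retryAfterMs?: number
): RetryDecision {
  const failure = classifyFailure(error);
  // Exponential backoff with up to 100 ms of jitter.
  const backoffMs = 250 * 2 ** (attempt - 1) + Math.floor(Math.random() * 100);
  if (attempt >= maxAttempts || failure === "refusal") {
    return { failure, retry: false, delayMs: 0 };
  }
  if (failure === "validator") {
    // One retry only; a second failure should be quarantined for review.
    const retry = attempt === 1;
    return { failure, retry, delayMs: retry ? backoffMs : 0 };
  }
  if (failure === "rate_limit" && retryAfterMs !== undefined) {
    // Never wait less than the provider asked for.
    return { failure, retry: true, delayMs: Math.max(retryAfterMs, backoffMs) };
  }
  return { failure, retry: true, delayMs: backoffMs };
}

console.log(decideRetry(new Error("validator_failed:priority"), 1, 3));
console.log(decideRetry(new Error("refusal"), 1, 3));
```

For truncation the decision only says that a retry is allowed; the caller still has to change the request (smaller prompt, larger output allowance, or narrower schema) before sending it again.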

For production, attach structured logs to each attempt. The minimum useful fields are schema_version, model, attempt, failure_type, latency, and request correlation ID. Avoid logging raw customer prompts unless your privacy policy and retention controls allow it.
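
A per-attempt log record with those fields might look like the sketch below. The `AttemptLog` type, the `buildAttemptLog` helper, and the exact field names (`latency_ms`, `correlation_id`) are assumptions made for illustration; the record deliberately carries no prompt or response text.

```typescript
type AttemptLog = {
  schema_version: string;
  model: string;
  attempt: number;
  failure_type: string | null; // null when the attempt succeeded
  latency_ms: number;
  correlation_id: string;
};

// Builds one log record per attempt from request metadata and two timestamps.
function buildAttemptLog(
  base: { schema_version: string; model: string; correlation_id: string },
  attempt: number,
  startedAtMs: number,
  finishedAtMs: number,
  failureType: string | null
): AttemptLog {
  return {
    schema_version: base.schema_version,
    model: base.model,
    attempt,
    failure_type: failureType,
    latency_ms: finishedAtMs - startedAtMs,
    correlation_id: base.correlation_id
  };
}

const attemptRecord = buildAttemptLog(
  { schema_version: "ticket_triage.v1", model: "gpt-4o-2024-08-06", correlation_id: "req-123" },
  2,
  1000,
  1840,
  "validator"
);

// One JSON object per line is easy for most log pipelines to ingest.
console.log(JSON.stringify(attemptRecord));
```

In a real loop the timestamps would come from `Date.now()` taken immediately before and after the API call.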

Verification

Use a deterministic fixture before sending live traffic. The expected output should pass JSON.parse, pass TicketTriageSchema.safeParse, and contain no fields outside the schema.

const result = await triageWithRetry(
  "Customer says they were charged twice after upgrading their plan."
);

console.log(result);

Expected output (representative; the summary wording, priority, and confidence value can vary between runs)

{
  "schema_version": "ticket_triage.v1",
  "category": "billing",
  "priority": "medium",
  "customer_visible_summary": "The customer reports a duplicate charge after upgrading their plan.",
  "needs_human_review": true,
  "confidence": 0.86
}

  • The object has exactly six fields.
  • category and priority are valid enum values.
  • confidence is a number between 0 and 1.
  • The schema version is stable for dashboards and replay jobs.
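
The "exactly six fields" check can be written as a small fixture assertion. The sketch below is one possible helper, with `EXPECTED_TRIAGE_KEYS` and `diffKeys` as names invented for this example. A strict Zod schema already rejects unknown keys, so the main benefit here is a readable diff when a fixture fails.

```typescript
const EXPECTED_TRIAGE_KEYS = [
  "schema_version",
  "category",
  "priority",
  "customer_visible_summary",
  "needs_human_review",
  "confidence"
];

// Returns keys that are missing from, or unexpected in, a parsed object.
function diffKeys(
  value: Record<string, unknown>,
  expected: string[]
): { missing: string[]; unexpected: string[] } {
  const actual = Object.keys(value);
  return {
    missing: expected.filter((key) => actual.indexOf(key) === -1),
    unexpected: actual.filter((key) => expected.indexOf(key) === -1)
  };
}

// A deliberately broken fixture: `confidence` is absent and an extra key leaked in.
const brokenFixture: Record<string, unknown> = JSON.parse(
  '{"schema_version":"ticket_triage.v1","category":"billing","priority":"medium",' +
    '"customer_visible_summary":"The customer reports a duplicate charge after upgrading their plan.",' +
    '"needs_human_review":true,"internal_note":"debug"}'
);

console.log(diffKeys(brokenFixture, EXPECTED_TRIAGE_KEYS));
```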

Troubleshooting

1. The API rejects the schema

Your generated schema may include unsupported keywords or nested constructs. Simplify the schema, remove advanced constraints, and test the exact JSON sent to the provider. Keep local validation stricter than provider validation when necessary.

2. The response validates but is semantically wrong

Schema validation proves shape, not truth. Add eval fixtures with known answers, track per-field accuracy, and use application rules for facts the model cannot infer reliably. For high-risk actions, route low-confidence or security-sensitive outputs to human review.
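
Per-field accuracy over a labeled fixture set can be computed with very little code. The sketch below assumes fixtures are compared position by position and restricts itself to the three decision fields; the `LabeledFields` type, the `perFieldAccuracy` function, and the sample data are all illustrative.

```typescript
type LabeledFields = {
  category: string;
  priority: string;
  needs_human_review: boolean;
};

// Computes, for each field, the fraction of fixtures where the model output
// matches the known answer. Lists are compared index by index.
function perFieldAccuracy(
  expected: LabeledFields[],
  actual: LabeledFields[]
): Record<keyof LabeledFields, number> {
  if (expected.length === 0 || expected.length !== actual.length) {
    throw new Error("fixture lists must be non-empty and the same length");
  }
  const fields: (keyof LabeledFields)[] = ["category", "priority", "needs_human_review"];
  const accuracy = { category: 0, priority: 0, needs_human_review: 0 };
  for (const field of fields) {
    let matches = 0;
    for (let i = 0; i < expected.length; i++) {
      if (expected[i][field] === actual[i][field]) matches++;
    }
    accuracy[field] = matches / expected.length;
  }
  return accuracy;
}

// Made-up labels and outputs for four fixtures.
const knownAnswers: LabeledFields[] = [
  { category: "billing", priority: "high", needs_human_review: false },
  { category: "bug", priority: "medium", needs_human_review: false },
  { category: "security", priority: "urgent", needs_human_review: true },
  { category: "account", priority: "low", needs_human_review: false }
];
const modelOutputs: LabeledFields[] = [
  { category: "billing", priority: "high", needs_human_review: false },
  { category: "bug", priority: "low", needs_human_review: false },
  { category: "security", priority: "high", needs_human_review: true },
  { category: "account", priority: "low", needs_human_review: true }
];

console.log(perFieldAccuracy(knownAnswers, modelOutputs));
// { category: 1, priority: 0.5, needs_human_review: 0.75 }
```

Every object in this sample would pass schema validation, yet `priority` is wrong half the time, which is exactly the gap that shape-only validation cannot see.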

3. Retries increase cost without improving success

Stop retrying every failure as if it were transient. Break metrics down by failure type, prompt size, schema version, and model. If validator failures cluster around one field, redesign that field instead of increasing retry count.
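
To see whether validator failures cluster around one field, tally them by field path. This sketch assumes error messages follow the `validator_failed:<field path>` convention used in `triageWithRetry` and that the input is a list of `error.message` strings; `countFailuresByField` is a name chosen for the example.

```typescript
// Tallies validator failures by field path. Messages that do not start with
// the "validator_failed:" prefix are ignored.
function countFailuresByField(messages: string[]): Record<string, number> {
  const prefix = "validator_failed:";
  const counts: Record<string, number> = {};
  for (const message of messages) {
    if (message.indexOf(prefix) !== 0) continue; // not a validator failure
    // An empty path means the issue was reported on the root object.
    const field = message.slice(prefix.length) || "(root)";
    counts[field] = (counts[field] ?? 0) + 1;
  }
  return counts;
}

// Made-up failure messages collected from logs.
const collectedMessages = [
  "validator_failed:customer_visible_summary",
  "validator_failed:customer_visible_summary",
  "validator_failed:confidence",
  "missing_output_text",
  "validator_failed:"
];

console.log(countFailuresByField(collectedMessages));
// { customer_visible_summary: 2, confidence: 1, "(root)": 1 }
```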

What's Next

Once the basic loop is stable, move from ad hoc examples to contract tests and observability. Structured outputs are most valuable when every consumer can prove which schema version it accepted and why a failure was retried or rejected.

  1. Add a golden fixture suite for common, adversarial, and empty inputs.
  2. Emit metrics for validation failure rate, retry rate, refusal rate, and p95 latency.
  3. Store schema versions beside outputs so migrations are auditable.
  4. Run canaries before changing prompts, models, or schema fields.
  5. Document the contract for downstream teams using examples that pass the validator.

Pro tip: Treat a schema change like an API change. Review it, test it, deploy it gradually, and keep rollback examples ready.
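
On the consumer side, treating the schema as an API usually means refusing versions the consumer has not been updated for. The sketch below shows one way to gate stored outputs by version; the `StoredOutput` envelope, the `ACCEPTED_SCHEMA_VERSIONS` list, and the `gateBySchemaVersion` function are hypothetical and not part of the tutorial's code.

```typescript
// Versions this consumer knows how to read (hypothetical list).
const ACCEPTED_SCHEMA_VERSIONS = ["ticket_triage.v1"];

// Envelope for an output stored beside its schema version.
type StoredOutput = { schema_version: string; payload: unknown; stored_at: string };

type GateResult =
  | { accepted: true; version: string }
  | { accepted: false; reason: string };

// Accepts a stored record only if its schema version is on the allow list,
// and returns a reason string otherwise so the rejection can be logged.
function gateBySchemaVersion(record: StoredOutput): GateResult {
  if (ACCEPTED_SCHEMA_VERSIONS.indexOf(record.schema_version) === -1) {
    return { accepted: false, reason: `unsupported_schema_version:${record.schema_version}` };
  }
  return { accepted: true, version: record.schema_version };
}

const replayedRecord: StoredOutput = {
  schema_version: "ticket_triage.v2",
  payload: {},
  stored_at: "2026-06-04T00:00:00Z"
};

console.log(gateBySchemaVersion(replayedRecord));
```

During a gradual rollout, the allow list would hold both the old and the new version until every stored record has been migrated or expired.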

Frequently Asked Questions

Do structured outputs mean I can skip JSON validation?
No. Provider-side structured output reduces malformed responses, but your application still needs runtime validation. Local validation catches provider limitations, schema conversion mistakes, unsafe assumptions, and downstream contract drift.
What is the best schema design for LLM structured outputs?
Use a small closed object with required fields, explicit enums, and additionalProperties: false. Keep semantic business rules in application code unless the schema constraint is simple and well supported by the provider.
How many retries should an LLM structured output call use?
Start with at most 2-3 attempts and classify the failure before retrying. Transport errors and truncation can be retryable; refusals and repeated validator failures usually need a different prompt, schema, or human review path.
Should I use Zod, Pydantic, or raw JSON Schema?
Use the validator native to your application stack, then generate provider JSON Schema from that source when possible. Zod is a strong fit for TypeScript services, while Pydantic is the common choice for Python services.
