Structured output with schemas

Free-form text is fine for a person to read, but code needs structure. When you want to feed a model's answer into a database, an API, or the next step of a program, you need predictable fields, not a paragraph. A schema, combined with validation, lets you treat the model's output as a dependable data source.

Overview

A schema is a contract: it names the fields you expect, their types, and which are required. You give the model the schema, ask for JSON only, then validate the response against the schema before using it. Validation matters because the model can still deviate from the schema (a missing field, a wrong type, stray prose around the JSON), and you want to catch a bad shape before it corrupts your data.

Key ideas

Describe the shape with a model class

In Python, Pydantic is a widely used library for defining and validating a schema.

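from typing import Literal

from pydantic import BaseModel, Field

class Dish(BaseModel):
    name: str
    price_inr: int = Field(gt=0)  # price must be a positive integer
    is_vegetarian: bool
    spice_level: Literal["mild", "medium", "hot"]  # any other value fails validation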

Ask for JSON that matches

Put the schema in the prompt and demand JSON only. The cleaner your instruction, the less cleanup you do.

import json
from anthropic import Anthropic
 
client = Anthropic()
 
schema_hint = json.dumps(Dish.model_json_schema(), indent=2)
 
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=300,
    system="Extract dish details. Respond with only valid JSON. No prose, no markdown fences.",
    messages=[{
        "role": "user",
        "content": f"Schema:\n{schema_hint}\n\nText: Ghee dosa, 90 rupees, veg, medium spicy.",
    }],
)
 
raw = response.content[0].text
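
Even with a strict instruction, a reply sometimes arrives wrapped in a markdown fence or with a sentence in front of the JSON. A minimal sketch of a cleanup step follows, assuming the reply contains a single top-level JSON object. `extract_json_text` is a hypothetical helper written for this lesson, not part of any SDK.

```python
import json


def extract_json_text(raw: str) -> str:
    """Return the first JSON object found in raw, as a string.

    Hypothetical helper: assumes the reply holds one top-level JSON object,
    possibly surrounded by prose or a markdown fence.
    """
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value and reports where it ended
            _, end = decoder.raw_decode(raw, start)
            return raw[start:end]
        except json.JSONDecodeError:
            # This brace did not open valid JSON; try the next one
            start = raw.find("{", start + 1)
    raise ValueError("no JSON object found in model output")
```

The cleaned string can then be handed to the validation step. This is a fallback for messy replies, not a substitute for validating the result.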

Always validate before you trust

Parsing is not enough: valid JSON can still have missing fields or wrong types, so confirm the data obeys the contract. Pydantic raises a clear error if it does not.

from pydantic import ValidationError
 
try:
    dish = Dish.model_validate_json(raw)
    print(dish.name, dish.price_inr)
except ValidationError as e:
    # Log the raw output, then retry or fall back
    print("Schema mismatch:", e)
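
The comment in the handler mentions retrying. One way that loop might look is sketched below. `call_model` and `parse` are hypothetical callables standing in for the API call and for `Dish.model_validate_json`, and feeding the error message back to the model is one common choice, not the only one.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def extract_with_retry(
    call_model: Callable[[str], str],
    parse: Callable[[str], T],
    prompt: str,
    max_attempts: int = 3,
) -> T:
    """Call the model until parse() accepts its output or attempts run out.

    call_model and parse are placeholders: in this lesson they would wrap
    client.messages.create(...) and Dish.model_validate_json.
    """
    current_prompt = prompt
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        try:
            return parse(raw)
        except ValueError as err:  # pydantic's ValidationError subclasses ValueError
            last_error = err
            # Show the model its own mistake so the next attempt can correct it
            current_prompt = (
                f"{prompt}\n\nYour previous reply was rejected:\n{err}\n"
                "Reply again with only valid JSON."
            )
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error
```

Capping the number of attempts keeps a persistently bad response from turning into an endless, costly loop; the caller decides what the fallback is when `RuntimeError` is raised.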

Prefer native structured output when available

Many providers offer a tool-use or structured-output mode that constrains the response to a schema, which removes most parsing failures. Use it when available; it is more reliable than parsing free text. You will see this same mechanism again in Week 3 for tool calling.
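
With the Anthropic SDK used earlier, one way to get this behavior is to define a tool whose `input_schema` is the JSON Schema and then force the model to call it. The sketch below only builds the request arguments and reads the result, so it runs without a network call. The tool name `record_dish` and both helper functions are made up for illustration, and the hand-written schema stands in for roughly what `Dish.model_json_schema()` describes.

```python
from types import SimpleNamespace

# Hand-written stand-in for the schema generated from the Dish model
DISH_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price_inr": {"type": "integer", "exclusiveMinimum": 0},
        "is_vegetarian": {"type": "boolean"},
        "spice_level": {"type": "string", "enum": ["mild", "medium", "hot"]},
    },
    "required": ["name", "price_inr", "is_vegetarian", "spice_level"],
}


def build_tool_request(schema: dict, text: str) -> dict:
    """Build keyword arguments for client.messages.create(model=..., **kwargs)."""
    return {
        "max_tokens": 300,
        "tools": [{
            "name": "record_dish",  # hypothetical tool name
            "description": "Record the details of one dish.",
            "input_schema": schema,
        }],
        # Force a call to this tool instead of a free-text answer
        "tool_choice": {"type": "tool", "name": "record_dish"},
        "messages": [{"role": "user", "content": text}],
    }


def read_tool_input(content_blocks) -> dict:
    """Return the arguments of the first tool_use block in a response."""
    for block in content_blocks:
        if block.type == "tool_use":
            return block.input  # already a dict, no JSON parsing needed
    raise ValueError("response contained no tool_use block")


# Usage with a stand-in block; a real one would come from response.content
fake_block = SimpleNamespace(type="tool_use", input={"name": "Ghee dosa", "price_inr": 90})
dish_fields = read_tool_input([fake_block])
```

Even in this mode, passing the returned dict through `Dish.model_validate` keeps the validation step in place, since the arguments are still model output.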

Quick recap

A schema is a contract: field names, types, and what is required.
Put the schema in the prompt and ask for JSON only, no prose or fences.
Always validate the response against the schema before using it.
Use native structured-output or tool-use modes when the provider offers them.
Treat model output as untrusted input.
