instructor Cheatsheet

A visual guide to instructor for structured LLM output: patch a client, set a response_model, and turn ragged model text into a validated, typed Pydantic object, with nested models, automatic retries, field-guided extraction, streaming, and raw-completion access.

Author: James Balamuta
Published: July 12, 2026

instructor turns ragged LLM text into validated, typed Python objects. It does not replace your model SDK; it wraps it. You patch any OpenAI or Anthropic client, pass response_model=YourModel, and get back a real Python object: not a string you have to parse, and not JSON you have to trust.

The recurring mental model in this sheet is one picture. A prompt on the left flows along a gray arrow into a model (the blue create() step), which returns ragged text; instructor then parses and validates that text into a typed object, drawn with an accent-green border on the right. The object is green when it validates, red when a field fails, and amber while a retry is in flight. Where a panel looks like a raw model SDK call, the contrast is the point: the SDK hands you a string, instructor hands you a validated typed object.

The conventional imports are import instructor and from pydantic import BaseModel, Field. Everything here reflects the current 2026 API (from_provider, response_model=, the instructor.Mode enum); the legacy patch, OpenAISchema, and @openai_function spellings are flagged per section.

Complete instructor cheatsheet (light mode): eight panels covering patching a client, defining a response_model, describing fields to guide the model, nested and list models, extracting one typed object, automatic retries on validation failure, streaming partial objects, and getting the raw completion alongside the typed object.

Patch a Client

instructor does not replace your model SDK; it wraps it. instructor.from_provider("openai/gpt-5-nano") builds and patches a client in one call, and the same call with "anthropic/claude-haiku-4-5" gives you the identical instructor surface over a different provider. When you already hold a configured SDK client (custom base_url, Azure, a proxy), wrap it directly with instructor.from_openai(OpenAI()) or instructor.from_anthropic(Anthropic()). Either way you get a client whose create() understands response_model=.

instructor patch panel: build and patch from a model string, the same call for an Anthropic model, patch an already-configured OpenAI client, patch an Anthropic client object, an async client, and pick the structured-output mode.

Wrap any OpenAI or Anthropic client; the same instructor surface comes out.

import instructor

client = instructor.from_provider("openai/gpt-5-nano")               # build + patch (recommended)
client = instructor.from_provider("anthropic/claude-haiku-4-5")      # same surface, Anthropic

from openai import OpenAI
client = instructor.from_openai(OpenAI())          # patch a client you already configured

from anthropic import Anthropic
client = instructor.from_anthropic(Anthropic())    # same, for the Anthropic SDK

client = instructor.from_provider("openai/gpt-5-nano", async_client=True)            # await client.create(...)
client = instructor.from_provider("openai/gpt-5-nano", mode=instructor.Mode.TOOLS)   # use the Mode enum

See Patching. Prefer from_provider; avoid the legacy instructor.patch(client) wrapper.

Define a response_model

The schema you want back is just a pydantic.BaseModel with type-annotated fields, and that is the whole contract: name: str, age: int, optional fields with | None = None, ranges with Field(ge=0, le=130), and fixed choices with Literal[...]. instructor reads this model, sends its schema to the model as a tool/JSON spec, and validates the reply against it, so the class you write is both the prompt and the parser.

instructor response_model panel: declare the shape you want, pass it as response_model, the return value is a real instance, make a field optional, constrain a value with Field, and restrict to a fixed set with Literal.

A plain Pydantic model is the schema you want back.

from typing import Literal
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str                                    # declare the shape you want
    age: int = Field(ge=0, le=130)               # constrain a value to a range
    nickname: str | None = None                  # optional, defaults to None
    role: Literal["admin", "user", "guest"]      # one of a fixed set

user = client.create(response_model=User, messages=[...])   # response_model= is the key argument
user.name   # 'Ada'  (a real str, not JSON)
user.age    # 36     (a real int)

See Models. Models are plain pydantic.BaseModel; the pre-1.0 OpenAISchema is gone.

Field Descriptions Guide the Model

Field(description="...") is prompt engineering that lives inside the schema: the description, any examples=, and the model’s own docstring are all sent to the model to steer what each field should contain and how it should be formatted. Constraints (ge, le, Literal) and validator error messages reinforce this, so you shape the output by editing the model, not by lengthening the prompt.

instructor fields panel: describe what a field means, steer formatting, constrain and describe together, steer with examples, a validation-error message guides the re-ask, and a model docstring provides whole-model context.

Field(description=…) is prompt engineering inside the schema.

from pydantic import BaseModel, Field

class Ticket(BaseModel):
    """A customer support ticket."""                                  # docstring is sent too
    full_name: str = Field(description="first and last name")         # describe a field
    date: str = Field(description="ISO 8601, e.g. 2026-06-18")        # steer the format
    score: int = Field(ge=0, le=100, description="confidence 0 to 100")  # constrain + describe
    owner: str = Field(description="...", examples=["Ada", "Grace"])  # steer with examples

See Prompting. The description, examples, and docstring are all sent to the model.
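
One way to check what guidance a schema carries is to inspect the JSON Schema Pydantic generates for it. The sketch below does that for the Ticket model and assumes nothing beyond Pydantic itself; the exact payload instructor sends to a provider may be wrapped differently.

```python
from pydantic import BaseModel, Field


class Ticket(BaseModel):
    """A customer support ticket."""

    full_name: str = Field(description="first and last name")
    date: str = Field(description="ISO 8601, e.g. 2026-06-18")
    score: int = Field(ge=0, le=100, description="confidence 0 to 100")
    owner: str = Field(description="...", examples=["Ada", "Grace"])


schema = Ticket.model_json_schema()

# The class docstring becomes the schema-level description.
print(schema["description"])

# Each field carries its own description, plus examples where given.
for field_name, spec in schema["properties"].items():
    print(field_name, "->", spec.get("description"), spec.get("examples"))
```

Editing a description string and re-running this is a quick way to preview a schema-side prompt change.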

Nested + List Models

Because the schema is plain Pydantic, models compose: put a User inside an Order, ask for list[Item], or set response_model=list[User] to extract many objects from one prompt. Optional nested blocks (address: Address | None = None) and typed list elements (list[Literal["new", "vip"]]) all validate the same way, so a complex object graph is described once and returned fully typed.

instructor nested panel: nest one model inside another, hold a list of sub-objects, extract a list as the top-level result, an optional nested block, deeply typed list values, and a self-documenting nested schema.

Compose models; ask for many at once.

from typing import Literal
from pydantic import BaseModel

class Address(BaseModel):
    city: str
    country: str

# User is the model from the earlier panel; Item stands for any other BaseModel you define.
class Order(BaseModel):
    user: User                                 # nest one model inside another
    items: list[Item]                          # a list of sub-objects
    address: Address | None = None             # optional nested block; absent is fine
    tags: list[Literal["new", "vip"]] = []     # deeply typed list values

users = client.create(response_model=list[User], messages=[...])   # extract many at once

See Lists and arrays. A complex object graph is described once and returned fully typed.

Extract One Typed Object

Pass response_model=User to client.create(...) (or the OpenAI-shaped client.chat.completions.create(...)) and you get back a real User instance, not a string and not a dict you have to trust. user.name is a str and user.age is an int, with IDE autocompletion and type checking. If the model returns something that cannot satisfy the schema, instructor raises a pydantic.ValidationError rather than handing you a malformed object. Anthropic models additionally need max_tokens= on every call.

instructor extract panel: extract from a sentence, use the OpenAI-shaped call, add a system instruction, the Anthropic max_tokens requirement, and bad input raising a ValidationError.

Ragged model text in, a validated object out.

user = client.create(                          # ragged text in, a validated object out
    response_model=User,
    messages=[{"role": "user", "content": "Ada Lovelace is 36"}],
)   # -> User(name='Ada Lovelace', age=36)

client.chat.completions.create(response_model=User, messages=[...])   # OpenAI-shaped call also works

client.create(                                 # add a system instruction
    response_model=User,
    messages=[{"role": "system", "content": "Extract the user."},
              {"role": "user", "content": text}],
)

client.create(response_model=User, max_tokens=1024, messages=[...])   # max_tokens required for Anthropic
# a reply that cannot satisfy the schema -> raises pydantic.ValidationError

See OpenAI integration. Both create and chat.completions.create accept response_model.
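
When a reply cannot satisfy the schema, the error is structured, not just a message. The sketch below shows what a pydantic.ValidationError reports, using a hand-written payload as a stand-in for a bad model reply; no model call is involved.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    name: str
    age: int = Field(ge=0, le=130)
    role: Literal["admin", "user", "guest"]


# Invented stand-in for a bad reply: age is out of range and role is missing.
bad_reply = {"name": "Ada", "age": 200}

try:
    User.model_validate(bad_reply)
    problems = []
except ValidationError as exc:
    # Each entry names the offending field ("loc") and the rule it broke ("type").
    problems = [(err["loc"], err["type"]) for err in exc.errors()]

for loc, kind in problems:
    print(loc, kind)
```

Logging these (loc, type) pairs is usually more useful than logging the raw exception text when you want to see which fields a model struggles with.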

Automatic Retries

When a reply fails validation, instructor does not crash: with max_retries=3 it sends the validation error back to the model and asks again, which is why a field_validator that raises is effectively a natural-language correction loop. Pass an integer for simple cases or a tenacity Retrying(stop=stop_after_attempt(n)) object for full control; if every attempt fails it raises InstructorRetryException.

instructor retries panel: a validation failure triggers a retry, a custom rule that can fail, the model sees the error message, fine-grained retry control with tenacity, and giving up after the budget with InstructorRetryException.

A failed validation is re-asked with the error, not crashed.

from pydantic import BaseModel, field_validator
from tenacity import Retrying, stop_after_attempt

class User(BaseModel):
    name: str

    @field_validator("name")                   # a custom rule that can fail
    @classmethod
    def caps(cls, v: str) -> str:
        if not v.istitle():
            raise ValueError("name must be Title Case")   # message is sent back to the model on a re-ask
        return v

client.create(response_model=User, max_retries=3, messages=[...])   # re-ask on validation failure

client.create(response_model=User,                                  # fine-grained control via tenacity
              max_retries=Retrying(stop=stop_after_attempt(3)),
              messages=[...])
# every attempt failed -> raises instructor.exceptions.InstructorRetryException

See Retrying. A raising field_validator becomes a natural-language correction loop.
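
The correction loop can be pictured with a small offline simulation. This is a sketch of the idea only, not instructor's implementation: fake_model is a hypothetical stand-in for an LLM that fixes its answer once it has seen the validation message, and extract_with_retries is an invented helper name.

```python
from pydantic import BaseModel, ValidationError, field_validator


class User(BaseModel):
    name: str

    @field_validator("name")
    @classmethod
    def title_case(cls, v: str) -> str:
        if not v.istitle():
            raise ValueError("name must be Title Case")
        return v


def fake_model(messages: list[dict]) -> str:
    """Hypothetical LLM stand-in: corrects itself once it has seen the error text."""
    saw_error = any("Title Case" in m["content"] for m in messages)
    return '{"name": "Ada Lovelace"}' if saw_error else '{"name": "ada lovelace"}'


def extract_with_retries(messages: list[dict], max_attempts: int = 3) -> tuple[User, int]:
    """Validate each reply; on failure, append the error and ask again."""
    history = list(messages)
    for attempt in range(1, max_attempts + 1):
        reply = fake_model(history)
        try:
            return User.model_validate_json(reply), attempt
        except ValidationError as exc:
            # Feed the failed reply and the validation message back into the conversation.
            history.append({"role": "assistant", "content": reply})
            history.append({"role": "user", "content": f"Fix these errors and retry:\n{exc}"})
    raise RuntimeError(f"no valid reply after {max_attempts} attempts")


user, attempts = extract_with_retries([{"role": "user", "content": "who is ada lovelace?"}])
print(user, attempts)
```

The point of the sketch is that the validator's message is the only new information the second attempt receives, so a specific message ("name must be Title Case") does more work than a generic one.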

Stream Partial Objects

For responsive UIs, client.create_partial(response_model=User, stream=True) yields the same object repeatedly as its fields fill in, so you can render a form that completes live instead of waiting for the full reply. Use client.create_iterable(...) to stream many objects of the same model one at a time, and add PartialLiteralMixin when a model has Literal fields. The final yielded value is a complete, fully validated object.

instructor streaming panel: stream one object as it fills, consume the partial stream, stream many objects with create_iterable, async streaming, the PartialLiteralMixin for Literal fields, and a complete final frame.

Watch fields fill in live instead of waiting for the whole object.

from typing import Literal

from pydantic import BaseModel
from instructor.dsl.partial import PartialLiteralMixin

stream = client.create_partial(                # stream one object as it fills in
    response_model=User, stream=True, messages=[...],
)
for partial in stream:                         # the same User, more fields each iteration
    print(partial)
# the final yielded value is a complete, fully validated User

for user in client.create_iterable(response_model=User, messages=[...]):   # many of the same model
    print(user)

class User(BaseModel, PartialLiteralMixin):    # needed when a model has Literal fields
    role: Literal["admin", "user"]

See Partial responses. The final yielded value is a complete, validated object.

with_completion (Raw Access)

When you need both the typed object and the underlying provider response, client.create_with_completion(...) returns a (object, completion) tuple so you can read completion.usage for token counts, completion.model, or the raw message, while still getting the validated object you came for. Event hooks via client.on("completion:kwargs", ...) and client.on("completion:error", ...) let you log or trace every call without touching the call sites.

instructor completion panel: get object plus raw completion, read token usage, reach the raw model id and metadata, inspect what the model sent, hook into every call for logging, and confirm the object is unchanged.

Get the typed object AND the raw response together.

user, completion = client.create_with_completion(   # get object + raw completion
    response_model=User, messages=[...],
)

completion.usage                  # token counts: prompt, completion, total
completion.model                  # 'gpt-5-nano'
completion.id                     # 'chatcmpl-...'
completion.choices[0].message     # the untouched SDK message object (OpenAI shape)

client.on("completion:kwargs", log_fn)   # hook every call for logging
client.on("completion:error", err_fn)    # hook every failure
# 'user' here is identical to a plain create(); only 'completion' is new

See Raw response. The validated object is unchanged; only the extra completion is new.
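
The log_fn and err_fn handlers above are left undefined, so here is one possible shape for them. This sketch assumes a completion:kwargs handler receives the call's arguments and a completion:error handler receives the exception; check the hooks documentation for the exact signatures in your version. The handlers are invoked by hand so the example runs without a client.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm_calls")

seen_calls: list[dict] = []
seen_errors: list[str] = []


def log_fn(*args, **kwargs) -> None:
    """Record which model was called and how many messages were sent."""
    record = {"model": kwargs.get("model"), "n_messages": len(kwargs.get("messages", []))}
    seen_calls.append(record)
    logger.info("call: %s", record)


def err_fn(error: Exception) -> None:
    """Record the failure type and message."""
    seen_errors.append(f"{type(error).__name__}: {error}")
    logger.warning("error: %s", error)


# With a real client these would be registered via client.on(...);
# here they are called directly (with invented arguments) to stay offline.
log_fn(model="gpt-5-nano", messages=[{"role": "user", "content": "Ada is 36"}])
err_fn(TimeoutError("request timed out"))
```

Logging a summary (model name, message count) instead of the full kwargs keeps prompts and user data out of the logs by default.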

Quick Reference

Key instructor calls.

| Command | What it does | Area |
| --- | --- | --- |
| instructor.from_provider("openai/gpt-5-nano") | Build + patch a client from a model string | Patch |
| instructor.from_openai(OpenAI()) | Patch a client object you already configured | Patch |
| instructor.from_anthropic(Anthropic()) | Same, for the Anthropic SDK | Patch |
| class User(BaseModel): ... | Declare the schema you want back | Model |
| response_model=User | Ask for that schema (the key argument) | Model |
| client.create(response_model=User, messages=...) | Extract one validated object | Extract |
| response_model=list[User] | Extract many objects at once | Nested |
| user: User field | Nest a model inside a model | Nested |
| max_retries=3 | Re-ask on validation failure | Retries |
| @field_validator("name") | Custom rule that can trigger a re-ask | Retries |
| Field(description="...") | Steer a field from inside the schema | Fields |
| client.create_partial(..., stream=True) | Stream one object filling in | Streaming |
| client.create_iterable(...) | Stream many objects | Streaming |
| client.create_with_completion(...) | Get (object, raw_completion) | Raw |
| client.on("completion:error", fn) | Hook every failed call for logging | Raw |

instructor call surfaces.

| Call | Returns | Use when |
| --- | --- | --- |
| client.create(response_model=M, ...) | a validated M | the default, one object back |
| client.chat.completions.create(response_model=M, ...) | a validated M | you prefer the OpenAI-shaped path |
| client.create_partial(response_model=M, stream=True, ...) | a generator of partial M | streaming a single object live |
| client.create_iterable(response_model=M, ...) | a generator of M | streaming many of the same model |
| client.create_with_completion(response_model=M, ...) | tuple (M, completion) | you also need token usage / raw response |

response_model building blocks.

| Field pattern | Meaning |
| --- | --- |
| name: str | Required typed field |
| nickname: str \| None = None | Optional, defaults to None |
| age: int = Field(ge=0, le=130) | Value constrained to a range |
| role: Literal["a", "b"] | One of a fixed set |
| items: list[Item] | A list of nested models |
| Field(description="...") | Guidance sent to the model |
| @field_validator("x") | Custom rule; raising re-asks the model |

instructor / provider errors.

| Exception | Raised when |
| --- | --- |
| pydantic.ValidationError | A reply does not satisfy the schema; this is the per-field error behind each re-ask, and it surfaces directly when retries are off |
| instructor.exceptions.InstructorRetryException | All max_retries attempts failed validation |
| provider SDK errors (openai.APIError, anthropic.APIError) | The underlying model call failed (network, auth, rate limit) |

Appendix: Sample Code

The text to typed object mental model

import instructor
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = instructor.from_provider("openai/gpt-5-nano")

user = client.create(
    response_model=User,
    messages=[{"role": "user", "content": "Ada Lovelace is 36 years old"}],
)

user            # User(name='Ada Lovelace', age=36)  <- a real object, not a string
user.name       # 'Ada Lovelace'   (str)
user.age        # 36               (int)

Nested models and a list result

import instructor
from pydantic import BaseModel

class Address(BaseModel):
    city: str
    country: str

class User(BaseModel):
    name: str
    age: int
    address: Address | None = None   # optional nested block

client = instructor.from_provider("openai/gpt-5-nano")

# Extract several users at once: response_model is a list type.
users = client.create(
    response_model=list[User],
    messages=[{
        "role": "user",
        "content": "Ada is 36 in London, UK. Grace is 41 in New York, USA.",
    }],
)
for u in users:
    print(u.name, u.age, u.address)

Field descriptions plus a validator-driven retry

This is the pattern to copy when the model keeps getting a field slightly wrong: describe it, constrain it, and let a failed validation re-ask the model automatically.

import instructor
from pydantic import BaseModel, Field, field_validator

class Contact(BaseModel):
    full_name: str = Field(description="first and last name, Title Case")
    email: str = Field(description="a valid email address")
    confidence: int = Field(ge=0, le=100, description="0 to 100")

    @field_validator("email")
    @classmethod
    def must_be_email(cls, v: str) -> str:
        if "@" not in v:
            raise ValueError("email must contain @")   # message is sent back to the model on a re-ask
        return v

client = instructor.from_provider("openai/gpt-5-nano")

contact = client.create(
    response_model=Contact,
    max_retries=3,            # re-ask up to 3 times on a validation failure
    messages=[{"role": "user", "content": "reach ada at ada lovelace gmail"}],
)
print(contact.full_name, contact.email, contact.confidence)

Streaming a single object as it fills in

import instructor
from pydantic import BaseModel

class Report(BaseModel):
    title: str
    summary: str
    score: int

client = instructor.from_provider("openai/gpt-5-nano")

stream = client.create_partial(
    response_model=Report,
    stream=True,
    messages=[{"role": "user", "content": "Summarize the Q2 sales report ..."}],
)

for partial in stream:
    print(partial)   # the same Report, more fields filled each iteration
# The final yielded value is a complete, fully validated Report.

Object plus raw completion (token usage, metadata)

import instructor
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = instructor.from_provider("openai/gpt-5-nano")

user, completion = client.create_with_completion(
    response_model=User,
    messages=[{"role": "user", "content": "Ada is 36"}],
)

user                       # User(name='Ada', age=36)   <- validated object
completion.usage           # token counts (prompt / completion / total)
completion.model           # 'gpt-5-nano'
# 'completion' is the untouched provider SDK response object.

The Anthropic path (note max_tokens)

import instructor
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = instructor.from_provider("anthropic/claude-haiku-4-5")

user = client.create(
    response_model=User,
    max_tokens=1024,          # required by the Anthropic API
    messages=[{"role": "user", "content": "Ada is 36"}],
)
print(user)

Behavior notes

  • Prefer from_provider. instructor.from_provider("provider/model") builds and patches in one call; reach for from_openai(OpenAI()) / from_anthropic(Anthropic()) only when you already hold a configured SDK client (custom base_url, org, proxy, Azure, Bedrock).
  • Models are plain Pydantic. Avoid the legacy instructor.patch(client) wrapper, the pre-1.0 OpenAISchema base class, and @openai_function decorators; write a pydantic.BaseModel and pass response_model=.
  • Set the mode with the enum. Use mode=instructor.Mode.TOOLS (the default), not a bare string.
  • Anthropic needs max_tokens=. The Anthropic API requires it on every call; OpenAI calls do not.
  • A raising validator is a correction loop. With max_retries, a field_validator that raises sends its message back to the model so the re-ask is guided, not random.
  • Streaming has two shapes. create_partial(stream=True) yields one object filling in; create_iterable(...) yields many objects of the same model. Add PartialLiteralMixin to a model with Literal fields so the JSON parser does not error on an incomplete literal mid-stream.
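
The Literal caveat in the last note is easy to reproduce with plain Pydantic. In this sketch, Account is an invented model and "adm" imitates a value cut off mid-stream while the model was still writing "admin"; it shows why strict Literal validation and half-written values do not mix, not how PartialLiteralMixin works internally.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class Account(BaseModel):  # hypothetical model for this sketch
    role: Literal["admin", "user"]


def accepts(value: str) -> bool:
    """Return True when `value` passes strict Literal validation."""
    try:
        Account.model_validate({"role": value})
        return True
    except ValidationError:
        return False


# "adm" imitates a frame truncated while "admin" was still being written.
results = {value: accepts(value) for value in ("adm", "admin")}
print(results)
```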
