instructor turns ragged LLM text into validated, typed Python objects. It does not replace your model SDK; it wraps it. You patch an OpenAI or Anthropic client, pass response_model=YourModel, and get back a real Python object: not a string you have to parse and not JSON you have to trust. The recurring mental model in this sheet is one picture: a prompt on the left flows along a gray arrow into a model (the blue create() step), which returns ragged text; instructor then parses and validates that text into a typed object, drawn with an accent-green border on the right. The object is green when it validates, red when a field fails, and amber while a retry is in flight. Where a call looks like a raw model SDK call, the contrast is the point: the SDK hands you a string, instructor hands you a validated typed object. The conventional imports are import instructor and from pydantic import BaseModel, Field. Everything here reflects the current 2026 API (from_provider, response_model=, the instructor.Mode enum); the legacy patch, OpenAISchema, and @openai_function spellings are flagged per section.
Patch a Client
instructor.from_provider("openai/gpt-5-nano") builds and patches a client in one call, and the same line with "anthropic/claude-4-5-haiku-latest" gives you the identical instructor surface over a different provider. When you already hold a configured SDK client (custom base_url, Azure, a proxy), wrap it directly with instructor.from_openai(OpenAI()) or instructor.from_anthropic(Anthropic()). Either way you get a client whose create() understands response_model=.
import instructor
client = instructor.from_provider("openai/gpt-5-nano") # build + patch (recommended)
client = instructor.from_provider("anthropic/claude-4-5-haiku-latest") # same surface, Anthropic
from openai import OpenAI
client = instructor.from_openai(OpenAI()) # patch a client you already configured
from anthropic import Anthropic
client = instructor.from_anthropic(Anthropic()) # same, for the Anthropic SDK
client = instructor.from_provider("openai/gpt-5-nano", async_client=True) # await client.create(...)
client = instructor.from_provider("openai/gpt-5-nano", mode=instructor.Mode.TOOLS) # use the Mode enum
See Patching. Prefer from_provider; avoid the legacy instructor.patch(client) wrapper.
Define a response_model
The schema you want back is just a pydantic.BaseModel with type-annotated fields, and that is the whole contract: name: str, age: int, optional fields with | None = None, ranges with Field(ge=0, le=130), and fixed choices with Literal[...]. instructor reads this model, sends its schema to the model as a tool/JSON spec, and validates the reply against it, so the class you write is both the prompt and the parser.
from typing import Literal
from pydantic import BaseModel, Field
class User(BaseModel):
    name: str # declare the shape you want
    age: int = Field(ge=0, le=130) # constrain a value to a range
    nickname: str | None = None # optional, defaults to None
    role: Literal["admin", "user", "guest"] # one of a fixed set
user = client.create(response_model=User, messages=[...]) # response_model= is the key argument
user.name # 'Ada' (a real str, not JSON)
user.age # 36 (a real int)
See Models. Models are plain pydantic.BaseModel; the pre-1.0 OpenAISchema is gone.
Field Descriptions Guide the Model
Field(description="...") is prompt engineering that lives inside the schema: the description, any examples=, and the model’s own docstring are all sent to the model to steer what each field should contain and how it should be formatted. Constraints (ge, le, Literal) and validator error messages reinforce this, so you shape the output by editing the model, not by lengthening the prompt.
from pydantic import BaseModel, Field
class Ticket(BaseModel):
    """A customer support ticket.""" # docstring is sent too
    full_name: str = Field(description="first and last name") # describe a field
    date: str = Field(description="ISO 8601, e.g. 2026-06-18") # steer the format
    score: int = Field(ge=0, le=100, description="confidence 0 to 100") # constrain + describe
    owner: str = Field(description="...", examples=["Ada", "Grace"]) # steer with examples
See Prompting. The description, examples, and docstring are all sent to the model.
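To see where descriptions, examples, and the docstring end up, you can inspect the JSON Schema Pydantic generates for the class. This is a small sketch assuming Pydantic v2; the "ticket owner" description is a hypothetical placeholder, and the exact payload instructor sends may wrap this schema differently per provider and mode.

```python
from pydantic import BaseModel, Field


class Ticket(BaseModel):
    """A customer support ticket."""

    full_name: str = Field(description="first and last name")
    score: int = Field(ge=0, le=100, description="confidence 0 to 100")
    # "ticket owner" is a hypothetical description used only for this sketch.
    owner: str = Field(description="ticket owner", examples=["Ada", "Grace"])


schema = Ticket.model_json_schema()
props = schema["properties"]

print(schema["description"])              # the class docstring
print(props["full_name"]["description"])  # the field description
print(props["score"]["minimum"], props["score"]["maximum"])  # from ge / le
print(props["owner"]["examples"])         # the examples list
```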
Nested + List Models
Because the schema is plain Pydantic, models compose: put a User inside an Order, ask for list[Item], or set response_model=list[User] to extract many objects from one prompt. Optional nested blocks (address: Address | None = None) and typed list elements (list[Literal["new", "vip"]]) all validate the same way, so a complex object graph is described once and returned fully typed.
from typing import Literal
from pydantic import BaseModel
class Address(BaseModel):
    city: str
    country: str
class Order(BaseModel): # User is the model defined earlier; Item is any other BaseModel
    user: User # nest one model inside another
    items: list[Item] # a list of sub-objects
    address: Address | None = None # optional nested block; absent is fine
    tags: list[Literal["new", "vip"]] = [] # deeply typed list values
users = client.create(response_model=list[User], messages=[...]) # extract many at once
See Lists and arrays. A complex object graph is described once and returned fully typed.
Extract One Typed Object
Pass response_model=User to client.create(...) (or the OpenAI-shaped client.chat.completions.create(...)) and you get back a real User instance, not a string and not a dict you have to trust. user.name is a str and user.age is an int with IDE autocompletion and type checking; if the model returns something that cannot satisfy the schema, instructor raises a pydantic.ValidationError (and Anthropic models additionally need max_tokens=).
user = client.create( # ragged text in, a validated object out
    response_model=User,
    messages=[{"role": "user", "content": "Ada Lovelace is 36"}],
) # -> User(name='Ada Lovelace', age=36)
client.chat.completions.create(response_model=User, messages=[...]) # OpenAI-shaped call also works
client.create( # add a system instruction
    response_model=User,
    messages=[{"role": "system", "content": "Extract the user."},
              {"role": "user", "content": text}],
)
client.create(response_model=User, max_tokens=1024, messages=[...]) # max_tokens required for Anthropic
# a reply that cannot satisfy the schema -> raises pydantic.ValidationError
See OpenAI integration. Both create and chat.completions.create accept response_model.
Automatic Retries
When a reply fails validation, instructor does not crash: with max_retries=3 it sends the validation error back to the model and asks again, which is why a field_validator that raises is effectively a natural-language correction loop. Pass an integer for simple cases or a tenacity Retrying(stop=stop_after_attempt(n)) object for full control; if every attempt fails it raises InstructorRetryException.
from pydantic import BaseModel, field_validator
from tenacity import Retrying, stop_after_attempt
class User(BaseModel):
    name: str
    @field_validator("name") # a custom rule that can fail
    @classmethod
    def caps(cls, v: str) -> str:
        assert v.istitle(), "name must be Title Case" # message is re-asked to the model
        return v
client.create(response_model=User, max_retries=3, messages=[...]) # re-ask on validation failure
client.create(response_model=User, # fine-grained control via tenacity
    max_retries=Retrying(stop=stop_after_attempt(3)),
    messages=[...])
# every attempt failed -> raises instructor.exceptions.InstructorRetryException
See Retrying. A raising field_validator becomes a natural-language correction loop.
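The correction loop is easier to reason about once you see its shape. The sketch below is an illustration of the idea only, not instructor's implementation: fake_model and create_with_retries are hypothetical names, and fake_model is a stand-in that returns a bad answer until it sees the validator's message in the conversation. It assumes Pydantic v2 and raises ValueError in the validator so the rule still applies when Python runs with assertions disabled.

```python
from pydantic import BaseModel, ValidationError, field_validator


class User(BaseModel):
    name: str

    @field_validator("name")
    @classmethod
    def title_case(cls, v: str) -> str:
        if not v.istitle():
            raise ValueError("name must be Title Case")
        return v


def fake_model(messages: list[dict]) -> str:
    """Hypothetical stand-in for an LLM: fixes its answer once it sees the error."""
    saw_error = any("Title Case" in m["content"] for m in messages)
    return '{"name": "Ada Lovelace"}' if saw_error else '{"name": "ada lovelace"}'


def create_with_retries(messages: list[dict], max_retries: int = 3) -> tuple[User, int]:
    """Re-ask up to max_retries times, feeding each validation error back."""
    messages = list(messages)  # copy so the caller's list is not mutated
    last_error = None
    for attempt in range(1, max_retries + 1):
        reply = fake_model(messages)
        try:
            return User.model_validate_json(reply), attempt
        except ValidationError as exc:
            last_error = exc
            # Show the model its own reply and the validator's complaint.
            messages.append({"role": "assistant", "content": reply})
            messages.append({"role": "user", "content": f"Fix these errors: {exc}"})
    raise RuntimeError(f"all {max_retries} attempts failed") from last_error


user, attempts = create_with_retries([{"role": "user", "content": "ada lovelace"}])
print(user.name, attempts)
```

The validator's message is the only guidance the second attempt receives, so a specific message ("name must be Title Case") makes a better re-ask than a generic one.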
Stream Partial Objects
For responsive UIs, client.create_partial(response_model=User, stream=True) yields the same object repeatedly as its fields fill in, so you can render a form that completes live instead of waiting for the full reply. Use client.create_iterable(...) to stream many objects of the same model one at a time, add PartialLiteralMixin when a model has Literal fields, and the final yielded value is a complete, fully validated object.
from typing import Literal
from pydantic import BaseModel
from instructor.dsl.partial import PartialLiteralMixin
stream = client.create_partial( # stream one object as it fills in
    response_model=User, stream=True, messages=[...],
)
for partial in stream: # the same User, more fields each iteration
    print(partial)
# the final yielded value is a complete, fully validated User
for user in client.create_iterable(response_model=User, messages=[...]): # many of the same model
    print(user)
class User(BaseModel, PartialLiteralMixin): # needed when a model has Literal fields
    role: Literal["admin", "user"]
See Partial responses. The final yielded value is a complete, validated object.
with_completion (Raw Access)
When you need both the typed object and the underlying provider response, client.create_with_completion(...) returns a (object, completion) tuple so you can read completion.usage for token counts, completion.model, or the raw message, while still getting the validated object you came for. Event hooks via client.on("completion:kwargs", ...) and client.on("completion:error", ...) let you log or trace every call without touching the call sites.
user, completion = client.create_with_completion( # get object + raw completion
    response_model=User, messages=[...],
)
completion.usage # token counts: prompt, completion, total
completion.model # 'gpt-5-nano'
completion.id # 'chatcmpl-...'
completion.choices[0].message # the untouched SDK message object (OpenAI shape)
client.on("completion:kwargs", log_fn) # hook every call for logging
client.on("completion:error", err_fn) # hook every failure
# 'user' here is identical to a plain create(); only 'completion' is new
See Raw response. The validated object is unchanged; only the extra completion is new.
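Because the completion is the provider's own object, the usage field names differ by provider: OpenAI responses expose prompt_tokens and completion_tokens, while Anthropic responses expose input_tokens and output_tokens. A small helper can normalize them. This is a sketch with a hypothetical token_counts function, and the SimpleNamespace objects are stand-ins for real SDK responses with made-up numbers.

```python
from types import SimpleNamespace


def token_counts(completion) -> dict[str, int]:
    """Read token usage from an OpenAI- or Anthropic-shaped completion."""
    usage = completion.usage
    # OpenAI names: prompt_tokens / completion_tokens.
    # Anthropic names: input_tokens / output_tokens.
    prompt = getattr(usage, "prompt_tokens", None)
    if prompt is None:
        prompt = getattr(usage, "input_tokens", 0)
    output = getattr(usage, "completion_tokens", None)
    if output is None:
        output = getattr(usage, "output_tokens", 0)
    return {"prompt": prompt, "completion": output, "total": prompt + output}


# Stand-ins for real SDK responses; the numbers are made up.
openai_like = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=8)
)
anthropic_like = SimpleNamespace(
    usage=SimpleNamespace(input_tokens=10, output_tokens=5)
)

print(token_counts(openai_like))
print(token_counts(anthropic_like))
```

In real use you would pass the second element of the tuple returned by create_with_completion(...).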
Quick Reference
| Command | What it does | Area |
|---|---|---|
| instructor.from_provider("openai/gpt-5-nano") | Build + patch a client from a model string | Patch |
| instructor.from_openai(OpenAI()) | Patch a client object you already configured | Patch |
| instructor.from_anthropic(Anthropic()) | Same, for the Anthropic SDK | Patch |
| class User(BaseModel): ... | Declare the schema you want back | Model |
| response_model=User | Ask for that schema (the key argument) | Model |
| client.create(response_model=User, messages=...) | Extract one validated object | Extract |
| response_model=list[User] | Extract many objects at once | Nested |
| user: User field | Nest a model inside a model | Nested |
| max_retries=3 | Re-ask on validation failure | Retries |
| @field_validator("name") | Custom rule that can trigger a re-ask | Retries |
| Field(description="...") | Steer a field from inside the schema | Fields |
| client.create_partial(..., stream=True) | Stream one object filling in | Streaming |
| client.create_iterable(...) | Stream many objects | Streaming |
| client.create_with_completion(...) | Get (object, raw_completion) | Raw |
| client.on("completion:kwargs", fn) | Hook every call for logging | Raw |
| client.on("completion:error", fn) | Hook every failure | Raw |
| Call | Returns | Use when |
|---|---|---|
| client.create(response_model=M, ...) | a validated M | the default, one object back |
| client.chat.completions.create(response_model=M, ...) | a validated M | you prefer the OpenAI-shaped path |
| client.create_partial(response_model=M, stream=True, ...) | a generator of partial M | streaming a single object live |
| client.create_iterable(response_model=M, ...) | a generator of M | streaming many of the same model |
| client.create_with_completion(response_model=M, ...) | tuple (M, completion) | you also need token usage / raw response |
| Field pattern | Meaning |
|---|---|
| name: str | Required typed field |
| nickname: str \| None = None | Optional, defaults to None |
| age: int = Field(ge=0, le=130) | Value constrained to a range |
| role: Literal["a", "b"] | One of a fixed set |
| items: list[Item] | A list of nested models |
| Field(description="...") | Guidance sent to the model |
| @field_validator("x") | Custom rule; raising re-asks the model |
| Exception | Raised when |
|---|---|
| pydantic.ValidationError | A reply does not satisfy the schema and retries are off / exhausted at the field level |
| instructor.exceptions.InstructorRetryException | All max_retries attempts failed validation |
| provider SDK errors (openai.APIError, anthropic.APIError) | The underlying model call failed (network, auth, rate limit) |
Appendix: Sample Code
The text to typed object mental model
import instructor
from pydantic import BaseModel
class User(BaseModel):
    name: str
    age: int
client = instructor.from_provider("openai/gpt-5-nano")
user = client.create(
    response_model=User,
    messages=[{"role": "user", "content": "Ada Lovelace is 36 years old"}],
)
user # User(name='Ada Lovelace', age=36) <- a real object, not a string
user.name # 'Ada Lovelace' (str)
user.age # 36 (int)

Nested models and a list result
import instructor
from pydantic import BaseModel
class Address(BaseModel):
    city: str
    country: str
class User(BaseModel):
    name: str
    age: int
    address: Address | None = None # optional nested block
client = instructor.from_provider("openai/gpt-5-nano")
# Extract several users at once: response_model is a list type.
users = client.create(
    response_model=list[User],
    messages=[{
        "role": "user",
        "content": "Ada is 36 in London, UK. Grace is 41 in New York, USA.",
    }],
)
for u in users:
    print(u.name, u.age, u.address)

Field descriptions plus a validator-driven retry
This is the pattern to copy when the model keeps getting a field slightly wrong: describe it, constrain it, and let a failed validation re-ask the model automatically.
import instructor
from pydantic import BaseModel, Field, field_validator
class Contact(BaseModel):
    full_name: str = Field(description="first and last name, Title Case")
    email: str = Field(description="a valid email address")
    confidence: int = Field(ge=0, le=100, description="0 to 100")
    @field_validator("email")
    @classmethod
    def must_be_email(cls, v: str) -> str:
        assert "@" in v, "email must contain @" # message is re-asked to the model
        return v
client = instructor.from_provider("openai/gpt-5-nano")
contact = client.create(
    response_model=Contact,
    max_retries=3, # re-ask up to 3 times on a validation failure
    messages=[{"role": "user", "content": "reach ada at ada lovelace gmail"}],
)
print(contact.full_name, contact.email, contact.confidence)

Streaming a single object as it fills in
import instructor
from pydantic import BaseModel
class Report(BaseModel):
    title: str
    summary: str
    score: int
client = instructor.from_provider("openai/gpt-5-nano")
stream = client.create_partial(
    response_model=Report,
    stream=True,
    messages=[{"role": "user", "content": "Summarize the Q2 sales report ..."}],
)
for partial in stream:
    print(partial) # the same Report, more fields filled each iteration
# The final yielded value is a complete, fully validated Report.

Object plus raw completion (token usage, metadata)
import instructor
from pydantic import BaseModel
class User(BaseModel):
    name: str
    age: int
client = instructor.from_provider("openai/gpt-5-nano")
user, completion = client.create_with_completion(
    response_model=User,
    messages=[{"role": "user", "content": "Ada is 36"}],
)
user # User(name='Ada', age=36) <- validated object
completion.usage # token counts (prompt / completion / total)
completion.model # 'gpt-5-nano'
# 'completion' is the untouched provider SDK response object.

The Anthropic path (note max_tokens)
import instructor
from pydantic import BaseModel
class User(BaseModel):
    name: str
    age: int
client = instructor.from_provider("anthropic/claude-4-5-haiku-latest")
user = client.create(
    response_model=User,
    max_tokens=1024, # required by the Anthropic API
    messages=[{"role": "user", "content": "Ada is 36"}],
)
print(user)

Behavior notes
- Prefer from_provider. instructor.from_provider("provider/model") builds and patches in one call; reach for from_openai(OpenAI()) / from_anthropic(Anthropic()) only when you already hold a configured SDK client (custom base_url, org, proxy, Azure, Bedrock).
- Models are plain Pydantic. Avoid the legacy instructor.patch(client) wrapper, the pre-1.0 OpenAISchema base class, and @openai_function decorators; write a pydantic.BaseModel and pass response_model=.
- Set the mode with the enum. Use mode=instructor.Mode.TOOLS (the default), not a bare string.
- Anthropic needs max_tokens=. OpenAI calls do not; Anthropic calls require it.
- A raising validator is a correction loop. With max_retries, a field_validator that raises sends its message back to the model so the re-ask is guided, not random.
- Streaming has two shapes. create_partial(stream=True) yields one object filling in; create_iterable(...) yields many objects of the same model. Add PartialLiteralMixin to a model with Literal fields so the JSON parser does not error on an incomplete literal mid-stream.
References
instructor documentation (latest)
- Documentation home and Getting started
- Concepts: Models, Patching, Prompting
- Lists and arrays, Retrying, Partial responses, Raw response
- API reference, Hooks, Iterable streaming
Provider integrations
Project and related libraries