3.2. Pydantic Schema
Pydantic allows automatic creation of JSON Schemas from models:
from enum import Enum
from pydantic import BaseModel, Field

class FooBar(BaseModel):
    count: int
    size: float = None

class Gender(str, Enum):
    male = 'male'
    female = 'female'
    other = 'other'
    not_given = 'not_given'

class MainModel(BaseModel):
    """
    This is the description of the main model
    """
    foo_bar: FooBar = Field(...)
    gender: Gender = Field(None, alias='Gender')
    snap: int = Field(
        42,
        title='The Snap',
        description='this is the value of snap',
        gt=30,
        lt=50,
    )

    class Config:
        title = 'Main'

# this is equivalent to json.dumps(MainModel.schema(), indent=2):
print(MainModel.schema_json(indent=2))
Outputs:
{
  "title": "Main",
  "description": "This is the description of the main model",
  "type": "object",
  "properties": {
    "foo_bar": {
      "$ref": "#/definitions/FooBar"
    },
    "Gender": {
      "$ref": "#/definitions/Gender"
    },
    "snap": {
      "title": "The Snap",
      "description": "this is the value of snap",
      "default": 42,
      "exclusiveMinimum": 30,
      "exclusiveMaximum": 50,
      "type": "integer"
    }
  },
  "required": [
    "foo_bar"
  ],
  "definitions": {
    "FooBar": {
      "title": "FooBar",
      "type": "object",
      "properties": {
        "count": {
          "title": "Count",
          "type": "integer"
        },
        "size": {
          "title": "Size",
          "type": "number"
        }
      },
      "required": [
        "count"
      ]
    },
    "Gender": {
      "title": "Gender",
      "description": "An enumeration.",
      "enum": [
        "male",
        "female",
        "other",
        "not_given"
      ],
      "type": "string"
    }
  }
}
The generated schemas are compliant with the specifications: JSON Schema Core, JSON Schema Validation and OpenAPI.
BaseModel.schema will return a dict of the schema, while BaseModel.schema_json will return a JSON string representation of that dict.

Sub-models used are added to the definitions JSON attribute and referenced, as per the spec.

All sub-models' (and their sub-models') schemas are put directly in a top-level definitions JSON key for easy re-use and reference.

'Sub-models' with modifications (via the Field function) like a custom title, description or default value, are recursively included instead of referenced.

The description for models is taken from either the docstring of the class or the description argument to the Field function.

The schema is generated by default using aliases as keys, but it can be generated using model property names instead by calling MainModel.schema(by_alias=False) or MainModel.schema_json(by_alias=False).

The format of $refs ("#/definitions/FooBar" above) can be altered by calling schema() or schema_json() with the ref_template keyword argument, e.g. ApplePie.schema(ref_template='/schemas/{model}.json#/'). Here {model} will be replaced with the model name using str.format().
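
As a rough illustration of the by_alias and ref_template options described above, the sketch below compares the property keys and the $ref value produced by three calls to schema(). The model names, the alias 'Snap' and the template string are made up for this example, and the code assumes a pydantic 1.x installation where schema() is the documented entry point.

```python
from pydantic import BaseModel, Field


class FooBar(BaseModel):
    count: int


class MainModel(BaseModel):
    foo_bar: FooBar
    # 'Snap' is a hypothetical alias chosen for this sketch
    snap: int = Field(42, alias='Snap')


# Default behaviour: properties are keyed by alias where one is defined.
by_alias = MainModel.schema()
# by_alias=False: properties are keyed by the model's attribute names.
by_name = MainModel.schema(by_alias=False)
# ref_template: {model} is filled in with the referenced model's name.
custom_refs = MainModel.schema(ref_template='/schemas/{model}.json#/')

print(sorted(by_alias['properties']))
print(sorted(by_name['properties']))
print(custom_refs['properties']['foo_bar'])
```

Only the way references are written changes with ref_template; hosting a document at the resulting location is left to the caller.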
3.2.1. Getting schema of a specified type
Pydantic includes two standalone utility functions, schema_of and schema_json_of, that can be used to apply the schema generation logic used for pydantic models in a more ad-hoc way. These functions behave similarly to BaseModel.schema and BaseModel.schema_json, but work with arbitrary pydantic-compatible types.
from typing import Literal, Annotated
from pydantic import BaseModel, Field, schema_json_of

class Cat(BaseModel):
    pet_type: Literal['cat']
    cat_name: str

class Dog(BaseModel):
    pet_type: Literal['dog']
    dog_name: str

Pet = Annotated[Cat | Dog, Field(discriminator='pet_type')]

print(schema_json_of(Pet, title='The Pet Schema', indent=2))
"""
{
  "title": "The Pet Schema",
  "discriminator": {
    "propertyName": "pet_type",
    "mapping": {
      "cat": "#/definitions/Cat",
      "dog": "#/definitions/Dog"
    }
  },
  "anyOf": [
    {
      "$ref": "#/definitions/Cat"
    },
    {
      "$ref": "#/definitions/Dog"
    }
  ],
  "definitions": {
    "Cat": {
      "title": "Cat",
      "type": "object",
      "properties": {
        "pet_type": {
          "title": "Pet Type",
          "enum": [
            "cat"
          ],
          "type": "string"
        },
        "cat_name": {
          "title": "Cat Name",
          "type": "string"
        }
      },
      "required": [
        "pet_type",
        "cat_name"
      ]
    },
    "Dog": {
      "title": "Dog",
      "type": "object",
      "properties": {
        "pet_type": {
          "title": "Pet Type",
          "enum": [
            "dog"
          ],
          "type": "string"
        },
        "dog_name": {
          "title": "Dog Name",
          "type": "string"
        }
      },
      "required": [
        "pet_type",
        "dog_name"
      ]
    }
  }
}
"""
3.2.2. Field customization
Optionally, the Field function can be used to provide extra information about the field and validations. It has the following arguments:
default: (a positional argument) the default value of the field. Since Field replaces the field's default, this first argument can be used to set the default. Use an ellipsis (...) to indicate that the field is required.

default_factory: a zero-argument callable that will be called when a default value is needed for this field. Among other purposes, this can be used to set dynamic default values. It is forbidden to set both default and default_factory.

alias: the public name of the field.

title: if omitted, field_name.title() is used.

description: if omitted and the annotation is a sub-model, the docstring of the sub-model will be used.

exclude: exclude this field when dumping (.dict and .json) the instance. The exact syntax and configuration options are described in detail in the exporting models section.

include: include (only) this field when dumping (.dict and .json) the instance. The exact syntax and configuration options are described in detail in the exporting models section.

const: if present, the value of this field must be the same as the field's default value.

gt: for numeric values (int, float, Decimal), adds a validation of 'greater than' and an annotation of exclusiveMinimum to the JSON Schema.

ge: for numeric values, this adds a validation of 'greater than or equal' and an annotation of minimum to the JSON Schema.

lt: for numeric values, this adds a validation of 'less than' and an annotation of exclusiveMaximum to the JSON Schema.

le: for numeric values, this adds a validation of 'less than or equal' and an annotation of maximum to the JSON Schema.

multiple_of: for numeric values, this adds a validation of 'a multiple of' and an annotation of multipleOf to the JSON Schema.

max_digits: for Decimal values, this adds a validation to have a maximum number of digits within the decimal. It does not include a zero before the decimal point or trailing decimal zeroes.

decimal_places: for Decimal values, this adds a validation to have at most a number of decimal places allowed. It does not include trailing decimal zeroes.

min_items: for list values, this adds a corresponding validation and an annotation of minItems to the JSON Schema.

max_items: for list values, this adds a corresponding validation and an annotation of maxItems to the JSON Schema.

unique_items: for list values, this adds a corresponding validation and an annotation of uniqueItems to the JSON Schema.

min_length: for string values, this adds a corresponding validation and an annotation of minLength to the JSON Schema.

max_length: for string values, this adds a corresponding validation and an annotation of maxLength to the JSON Schema.

allow_mutation: a boolean which defaults to True. When False, the field raises a TypeError if the field is assigned on an instance. The model config must set validate_assignment to True for this check to be performed.

regex: for string values, this adds a regular expression validation generated from the passed string and an annotation of pattern to the JSON Schema. pydantic validates strings using re.match, which treats regular expressions as implicitly anchored at the beginning. In contrast, JSON Schema validators treat the pattern keyword as implicitly unanchored, more like what re.search does. For interoperability, depending on your desired behavior, either explicitly anchor your regular expressions with ^ (e.g. ^foo to match any string starting with foo), or explicitly allow an arbitrary prefix with .*? (e.g. .*?foo to match any string containing the substring foo). See https://github.com/samuelcolvin/pydantic/issues/1631 for a discussion of possible changes to pydantic behavior in v2.

repr: a boolean which defaults to True. When False, the field will be hidden from the object representation.

**: any other keyword arguments (e.g. examples) will be added verbatim to the field's schema.
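
The anchoring difference noted for the regex argument can be reproduced with the standard library alone. The sketch below does not call pydantic or a JSON Schema validator; it only contrasts re.match with re.search (used here as a stand-in for unanchored pattern matching) on a sample string chosen for illustration.

```python
import re

value = 'a foo b'

# re.match only succeeds when the pattern matches at the start of the string.
match_plain = re.match('foo', value)
# re.search scans the whole string, closer to an unanchored "pattern".
search_plain = re.search('foo', value)

# Explicit anchor: both functions agree the string must start with foo.
match_anchored = re.match('^foo', value)
search_anchored = re.search('^foo', value)

# Explicit arbitrary prefix: both functions agree foo may appear anywhere.
match_prefixed = re.match('.*?foo', value)
search_prefixed = re.search('.*?foo', value)

# Each line prints "True True".
print(match_plain is None, search_plain is not None)
print(match_anchored is None, search_anchored is None)
print(match_prefixed is not None, search_prefixed is not None)
```

With the bare pattern the two functions disagree; with either explicit form they agree, which is the point of the interoperability advice.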
Instead of using Field, the fields property of the Config class can be used to set all of the arguments above except default.
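
To make the link between Field arguments and JSON Schema keywords concrete, here is a minimal sketch using a hypothetical Item model (the field names and limits are invented for this example). It assumes pydantic 1.x and touches only a few of the arguments listed above.

```python
from pydantic import BaseModel, Field, ValidationError


class Item(BaseModel):
    # ge / le should surface as "minimum" / "maximum"
    quantity: int = Field(1, ge=1, le=100, description='units ordered')
    # min_length / max_length should surface as "minLength" / "maxLength"
    code: str = Field(..., min_length=3, max_length=8)


item_schema = Item.schema()
props = item_schema['properties']
print(props['quantity'])
print(props['code'])
print(item_schema['required'])

# The same constraints are also checked when data is parsed.
try:
    Item(code='ab')
except ValidationError:
    short_code_rejected = True
else:
    short_code_rejected = False
print(short_code_rejected)
```

Because quantity has a default and code uses the ellipsis, only code is expected to appear under required.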
3.2.3. Unenforced Field constraints
If pydantic finds constraints which are not being enforced, an error will be raised. If you want to force the constraint to appear in the schema, even though it's not being checked upon parsing, you can use variadic arguments to Field() with the raw schema attribute name:
from pydantic import BaseModel, Field, PositiveInt

try:
    # this won't work since PositiveInt takes precedence over the
    # constraints defined in Field meaning they're ignored
    class Model(BaseModel):
        foo: PositiveInt = Field(..., lt=10)
except ValueError as e:
    print(e)
    """
    On field "foo" the following field constraints are set but not enforced:
    lt.
    For more details see https://pydantic-docs.helpmanual.io/usage/schema/#unenforced-field-constraints
    """

# but you can set the schema attribute directly:
# (Note: here exclusiveMaximum will not be enforced)
class Model(BaseModel):
    foo: PositiveInt = Field(..., exclusiveMaximum=10)

print(Model.schema())
"""
{
    'title': 'Model',
    'type': 'object',
    'properties': {
        'foo': {
            'title': 'Foo',
            'exclusiveMaximum': 10,
            'exclusiveMinimum': 0,
            'type': 'integer',
        },
    },
    'required': ['foo'],
}
"""

# if you find yourself needing this, an alternative is to declare
# the constraints in Field (or you could use conint())
class Model(BaseModel):
    # Here both constraints will be applied and the schema
    # will be generated correctly
    foo: int = Field(..., gt=0, lt=10)

print(Model.schema())
"""
{
    'title': 'Model',
    'type': 'object',
    'properties': {
        'foo': {
            'title': 'Foo',
            'exclusiveMinimum': 0,
            'exclusiveMaximum': 10,
            'type': 'integer',
        },
    },
    'required': ['foo'],
}
"""
3.2.4. typing.Annotated Fields
Rather than assigning a Field value, it can be specified in the type hint with typing.Annotated:
from uuid import uuid4
from pydantic import BaseModel, Field
from typing_extensions import Annotated

class Foo(BaseModel):
    id: Annotated[str, Field(default_factory=lambda: uuid4().hex)]
    name: Annotated[str, Field(max_length=256)] = 'Bar'
Field can only be supplied once per field; an error will be raised if it is used both in Annotated and as the assigned value. Defaults can be set outside Annotated as the assigned value or with Field.default_factory inside Annotated. The Field.default argument is not supported inside Annotated.

For versions of Python prior to 3.9, typing_extensions.Annotated can be used.
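
As a quick check of how such a model behaves at runtime, the sketch below re-declares the Foo model from the example above and exercises its defaults and its length constraint. It assumes Python 3.9+ (for typing.Annotated) and a pydantic release that supports Annotated fields; the variable names are only for illustration.

```python
from typing import Annotated
from uuid import uuid4

from pydantic import BaseModel, Field, ValidationError


class Foo(BaseModel):
    # default_factory inside Annotated: a fresh hex id per instance
    id: Annotated[str, Field(default_factory=lambda: uuid4().hex)]
    # constraint inside Annotated, default as the assigned value
    name: Annotated[str, Field(max_length=256)] = 'Bar'


first = Foo()
second = Foo()
print(first.name)
print(first.id != second.id)

# The max_length constraint declared inside Annotated is still enforced.
try:
    Foo(name='x' * 257)
except ValidationError:
    too_long_rejected = True
else:
    too_long_rejected = False
print(too_long_rejected)
```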
3.2.5. Modifying schema in custom fields
Custom field types can customise the schema generated for them using the __modify_schema__ class method; see Custom Data Types for more details.

__modify_schema__ can also take a field argument which will have type ModelField | None. pydantic will inspect the signature of __modify_schema__ to determine whether the field argument should be included.
from pydantic import BaseModel, Field
from pydantic.fields import ModelField

class RestrictedAlphabetStr(str):
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, value, field: ModelField):
        alphabet = field.field_info.extra['alphabet']
        if any(c not in alphabet for c in value):
            raise ValueError(f'{value!r} is not restricted to {alphabet!r}')
        return cls(value)

    @classmethod
    def __modify_schema__(cls, field_schema, field: ModelField | None):
        if field:
            alphabet = field.field_info.extra['alphabet']
            field_schema['examples'] = [c * 3 for c in alphabet]

class MyModel(BaseModel):
    value: RestrictedAlphabetStr = Field(alphabet='ABC')

print(MyModel.schema_json(indent=2))
Outputs:
{
  "title": "MyModel",
  "type": "object",
  "properties": {
    "value": {
      "title": "Value",
      "alphabet": "ABC",
      "examples": [
        "AAA",
        "BBB",
        "CCC"
      ],
      "type": "string"
    }
  },
  "required": [
    "value"
  ]
}
3.2.6. JSON Schema Types
Types, custom field types, and constraints (like max_length) are mapped to the corresponding spec formats in the following priority order (when there is an equivalent available):

The standard format JSON field is used to define pydantic extensions for more complex string sub-types.

The field schema mapping from Python / pydantic to JSON Schema is done as follows:
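
The sketch below is not a reproduction of the full mapping; it only prints what schema_of generates for a handful of standard types, which gives a feel for how the type and format keywords are used. The selection of types is arbitrary, pydantic 1.x is assumed, and any auto-generated title is dropped so that only the mapping itself is shown.

```python
from datetime import datetime
from typing import List
from uuid import UUID

from pydantic import schema_of

# A small, arbitrary sample of types to inspect.
types_to_check = {
    'int': int,
    'float': float,
    'str': str,
    'List[str]': List[str],
    'datetime': datetime,
    'UUID': UUID,
}

generated = {}
for label, tp in types_to_check.items():
    type_schema = schema_of(tp)
    # Drop the generated title, if any, to keep the output focused.
    type_schema.pop('title', None)
    generated[label] = type_schema
    print(label, '->', type_schema)
```

Types such as datetime and UUID are expected to come out as strings distinguished by their format value, which is the role of the format field described above.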
3.2.7. Top-level schema generation
You can also generate a top-level JSON Schema that only includes a list of models and related sub-models in its definitions:
import json
from pydantic import BaseModel
from pydantic.schema import schema

class Foo(BaseModel):
    a: str = None

class Model(BaseModel):
    b: Foo

class Bar(BaseModel):
    c: int

top_level_schema = schema([Model, Bar], title='My Schema')
print(json.dumps(top_level_schema, indent=2))
Outputs:
{
  "title": "My Schema",
  "definitions": {
    "Foo": {
      "title": "Foo",
      "type": "object",
      "properties": {
        "a": {
          "title": "A",
          "type": "string"
        }
      }
    },
    "Model": {
      "title": "Model",
      "type": "object",
      "properties": {
        "b": {
          "$ref": "#/definitions/Foo"
        }
      },
      "required": [
        "b"
      ]
    },
    "Bar": {
      "title": "Bar",
      "type": "object",
      "properties": {
        "c": {
          "title": "C",
          "type": "integer"
        }
      },
      "required": [
        "c"
      ]
    }
  }
}
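
A consumer of such a top-level schema typically needs to follow the $ref values back into definitions. The sketch below shows one way to do that with plain dictionaries, using a trimmed copy of the output above. resolve_ref is a hypothetical helper written for this example, not part of pydantic, and it handles only local references of the form '#/a/b' without JSON Pointer escape sequences.

```python
# A trimmed copy of the top-level schema shown above.
top_level_schema = {
    'title': 'My Schema',
    'definitions': {
        'Foo': {
            'title': 'Foo',
            'type': 'object',
            'properties': {'a': {'title': 'A', 'type': 'string'}},
        },
        'Model': {
            'title': 'Model',
            'type': 'object',
            'properties': {'b': {'$ref': '#/definitions/Foo'}},
            'required': ['b'],
        },
    },
}


def resolve_ref(document, ref):
    """Follow a local '#/a/b' reference inside document (hypothetical helper)."""
    if not ref.startswith('#/'):
        raise ValueError(f'only local references are supported: {ref!r}')
    node = document
    # Walk the path one key at a time, starting from the document root.
    for key in ref[2:].split('/'):
        node = node[key]
    return node


ref = top_level_schema['definitions']['Model']['properties']['b']['$ref']
foo_schema = resolve_ref(top_level_schema, ref)
print(foo_schema['title'])
```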