3.2. Pydantic Schema

Source: https://pydantic-docs.helpmanual.io/usage/schema/

Pydantic can automatically generate JSON Schemas from models:

from enum import Enum
from pydantic import BaseModel, Field


class FooBar(BaseModel):
    count: int
    size: float = None


class Gender(str, Enum):
    male = 'male'
    female = 'female'
    other = 'other'
    not_given = 'not_given'


class MainModel(BaseModel):
    """
    This is the description of the main model
    """

    foo_bar: FooBar = Field(...)
    gender: Gender = Field(None, alias='Gender')
    snap: int = Field(
        42,
        title='The Snap',
        description='this is the value of snap',
        gt=30,
        lt=50,
    )

    class Config:
        title = 'Main'


# this is equivalent to json.dumps(MainModel.schema(), indent=2):
print(MainModel.schema_json(indent=2))

Outputs:

{
  "title": "Main",
  "description": "This is the description of the main model",
  "type": "object",
  "properties": {
    "foo_bar": {
      "$ref": "#/definitions/FooBar"
    },
    "Gender": {
      "$ref": "#/definitions/Gender"
    },
    "snap": {
      "title": "The Snap",
      "description": "this is the value of snap",
      "default": 42,
      "exclusiveMinimum": 30,
      "exclusiveMaximum": 50,
      "type": "integer"
    }
  },
  "required": [
    "foo_bar"
  ],
  "definitions": {
    "FooBar": {
      "title": "FooBar",
      "type": "object",
      "properties": {
        "count": {
          "title": "Count",
          "type": "integer"
        },
        "size": {
          "title": "Size",
          "type": "number"
        }
      },
      "required": [
        "count"
      ]
    },
    "Gender": {
      "title": "Gender",
      "description": "An enumeration.",
      "enum": [
        "male",
        "female",
        "other",
        "not_given"
      ],
      "type": "string"
    }
  }
}

The generated schemas are compliant with the specifications: JSON Schema Core, JSON Schema Validation and OpenAPI.

BaseModel.schema will return a dict of the schema, while BaseModel.schema_json will return a JSON string representation of that dict.
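
The difference between the two return types can be seen in a minimal sketch. Point is a hypothetical model used only for illustration, and the import guard is an assumption made so that the snippet also runs where pydantic 2 is installed, which ships the 1.x API under pydantic.v1.

```python
import json

try:
    # pydantic 2 ships the 1.x API described here under pydantic.v1
    from pydantic.v1 import BaseModel
except ImportError:
    # plain pydantic 1.x installation
    from pydantic import BaseModel


class Point(BaseModel):
    """Hypothetical model used only for this illustration."""

    x: int
    y: int = 0


as_dict = Point.schema()       # a Python dict
as_json = Point.schema_json()  # a JSON string encoding the same dict

print(type(as_dict).__name__)          # dict
print(type(as_json).__name__)          # str
print(json.loads(as_json) == as_dict)  # True
```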

Sub-models used are added to the definitions JSON attribute and referenced, as per the spec.

All sub-models' (and their sub-models') schemas are put directly in a top-level definitions JSON key for easy re-use and reference.

'Sub-models' with modifications (via the Field class) like a custom title, description or default value, are recursively included instead of referenced.

The description for models is taken from either the docstring of the class or the description argument to the Field class.

The schema is generated by default using aliases as keys, but it can be generated using model property names instead by calling MainModel.schema(by_alias=False) or MainModel.schema_json(by_alias=False).
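
As a rough illustration of the by_alias switch, the sketch below compares the property keys produced in both modes. The User model and its fullName alias are hypothetical, and the import guard assumes that pydantic 2, if installed, exposes the 1.x API as pydantic.v1.

```python
try:
    # pydantic 2 ships the 1.x API described here under pydantic.v1
    from pydantic.v1 import BaseModel, Field
except ImportError:
    # plain pydantic 1.x installation
    from pydantic import BaseModel, Field


class User(BaseModel):
    # hypothetical field name and alias
    full_name: str = Field(..., alias='fullName')


by_alias = User.schema()               # default: keys are aliases
by_name = User.schema(by_alias=False)  # keys are the Python attribute names

print(list(by_alias['properties']))  # ['fullName']
print(list(by_name['properties']))   # ['full_name']
print(by_alias['required'], by_name['required'])
```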

The format of $refs ("#/definitions/FooBar" above) can be altered by calling schema() or schema_json() with the ref_template keyword argument, e.g. ApplePie.schema(ref_template='/schemas/{model}.json#/'). Here {model} will be replaced with the model name using str.format().
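
A small sketch of ref_template in use follows. ApplePie and Filling are hypothetical models invented for the illustration, and the import guard is an assumption so that the snippet also runs against pydantic 2's bundled pydantic.v1 namespace.

```python
try:
    # pydantic 2 ships the 1.x API described here under pydantic.v1
    from pydantic.v1 import BaseModel
except ImportError:
    # plain pydantic 1.x installation
    from pydantic import BaseModel


class Filling(BaseModel):  # hypothetical sub-model
    flavour: str


class ApplePie(BaseModel):  # hypothetical parent model
    filling: Filling


default_refs = ApplePie.schema()
custom_refs = ApplePie.schema(ref_template='/schemas/{model}.json#/')

print(default_refs['properties']['filling'])  # {'$ref': '#/definitions/Filling'}
print(custom_refs['properties']['filling'])   # {'$ref': '/schemas/Filling.json#/'}
```

Only the reference string changes; the sub-model's own schema is still emitted under the definitions key.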

3.2.1. Getting schema of a specified type

Pydantic includes two standalone utility functions, schema_of and schema_json_of, that apply the schema generation logic used for pydantic models in a more ad-hoc way. These functions behave similarly to BaseModel.schema and BaseModel.schema_json, but work with arbitrary pydantic-compatible types.

from typing import Literal, Annotated
from pydantic import BaseModel, Field, schema_json_of


class Cat(BaseModel):
    pet_type: Literal['cat']
    cat_name: str


class Dog(BaseModel):
    pet_type: Literal['dog']
    dog_name: str


Pet = Annotated[Cat | Dog, Field(discriminator='pet_type')]

print(schema_json_of(Pet, title='The Pet Schema', indent=2))
"""
{
  "title": "The Pet Schema",
  "discriminator": {
    "propertyName": "pet_type",
    "mapping": {
      "cat": "#/definitions/Cat",
      "dog": "#/definitions/Dog"
    }
  },
  "anyOf": [
    {
      "$ref": "#/definitions/Cat"
    },
    {
      "$ref": "#/definitions/Dog"
    }
  ],
  "definitions": {
    "Cat": {
      "title": "Cat",
      "type": "object",
      "properties": {
        "pet_type": {
          "title": "Pet Type",
          "enum": [
            "cat"
          ],
          "type": "string"
        },
        "cat_name": {
          "title": "Cat Name",
          "type": "string"
        }
      },
      "required": [
        "pet_type",
        "cat_name"
      ]
    },
    "Dog": {
      "title": "Dog",
      "type": "object",
      "properties": {
        "pet_type": {
          "title": "Pet Type",
          "enum": [
            "dog"
          ],
          "type": "string"
        },
        "dog_name": {
          "title": "Dog Name",
          "type": "string"
        }
      },
      "required": [
        "pet_type",
        "dog_name"
      ]
    }
  }
}
"""

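The same functions also accept plain typing constructs that involve no model at all. A minimal sketch, assuming pydantic 1.x behaviour; the 'Numbers' title is an arbitrary example value, and the import guard is only there so the snippet also runs against pydantic 2's bundled pydantic.v1 namespace:

```python
from typing import List

try:
    # pydantic 2 ships the 1.x API described here under pydantic.v1
    from pydantic.v1 import schema_json_of, schema_of
except ImportError:
    # plain pydantic 1.x installation
    from pydantic import schema_json_of, schema_of

# schema_of returns a dict, like BaseModel.schema
numbers_schema = schema_of(List[int], title='Numbers')
print(numbers_schema)

# schema_json_of returns a JSON string, like BaseModel.schema_json
numbers_json = schema_json_of(List[int], title='Numbers', indent=2)
print(numbers_json)
```
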
3.2.2. Field customization

Optionally, the Field function can be used to provide extra information about the field and validations. It has the following arguments:

default: (a positional argument) the default value of the field. Since Field replaces the field's default, this first argument can be used to set the default. Use ellipsis (...) to indicate the field is required.
default_factory: a zero-argument callable that will be called when a default value is needed for this field. Among other purposes, this can be used to set dynamic default values. It is forbidden to set both default and default_factory.
alias: the public name of the field
title: if omitted, field_name.title() is used
description: if omitted and the annotation is a sub-model, the docstring of the sub-model will be used
exclude: exclude this field when dumping (.dict and .json) the instance. The exact syntax and configuration options are described in detail in the exporting models section.
include: include (only) this field when dumping (.dict and .json) the instance. The exact syntax and configuration options are described in detail in the exporting models section.
const: if present, the field's value must be the same as its default value.
gt: for numeric values (int, float, Decimal), adds a validation of 'greater than' and an annotation of exclusiveMinimum to the JSON Schema
ge: for numeric values, this adds a validation of 'greater than or equal' and an annotation of minimum to the JSON Schema
lt: for numeric values, this adds a validation of 'less than' and an annotation of exclusiveMaximum to the JSON Schema
le: for numeric values, this adds a validation of 'less than or equal' and an annotation of maximum to the JSON Schema
multiple_of: for numeric values, this adds a validation of 'a multiple of' and an annotation of multipleOf to the JSON Schema
max_digits: for Decimal values, this adds a validation to have a maximum number of digits within the decimal. It does not include a zero before the decimal point or trailing decimal zeroes.
decimal_places: for Decimal values, this adds a validation to have at most a number of decimal places allowed. It does not include trailing decimal zeroes.
min_items: for list values, this adds a corresponding validation and an annotation of minItems to the JSON Schema
max_items: for list values, this adds a corresponding validation and an annotation of maxItems to the JSON Schema
unique_items: for list values, this adds a corresponding validation and an annotation of uniqueItems to the JSON Schema
min_length: for string values, this adds a corresponding validation and an annotation of minLength to the JSON Schema
max_length: for string values, this adds a corresponding validation and an annotation of maxLength to the JSON Schema
allow_mutation: a boolean which defaults to True. When False, the field raises a TypeError if the field is assigned on an instance. The model config must set validate_assignment to True for this check to be performed.
regex: for string values, this adds a Regular Expression validation generated from the passed string and an annotation of pattern to the JSON Schema

Note: pydantic validates strings using re.match, which treats regular expressions as implicitly anchored at the beginning. By contrast, JSON Schema validators treat the pattern keyword as implicitly unanchored, more like what re.search does.

For interoperability, depending on your desired behavior, either explicitly anchor your regular expressions with ^ (e.g. ^foo to match any string starting with foo), or explicitly allow an arbitrary prefix with .*? (e.g. .*?foo to match any string containing the substring foo).

See https://github.com/samuelcolvin/pydantic/issues/1631 for a discussion of possible changes to pydantic behavior in v2.

repr: a boolean which defaults to True. When False, the field is hidden from the object representation.
** any other keyword arguments (e.g. examples) will be added verbatim to the field's schema
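
The anchoring difference described for the regex argument can be reproduced with the standard library alone. The sketch below uses re.match as a stand-in for pydantic's check and re.search as a stand-in for a JSON Schema validator's pattern check; the sample strings are arbitrary.

```python
import re

value = 'a foo b'

# re.match only succeeds at the start of the string (implicitly anchored)
matched_at_start = re.match('foo', value) is not None
# re.search succeeds anywhere in the string (implicitly unanchored)
found_anywhere = re.search('foo', value) is not None

print(matched_at_start)  # False
print(found_anywhere)    # True

# With an explicit ^ anchor or an explicit .*? prefix, both functions agree
agreement = {}
for pattern in ('^foo', '.*?foo'):
    for text in ('foo bar', 'a foo b'):
        via_match = re.match(pattern, text) is not None
        via_search = re.search(pattern, text) is not None
        agreement[(pattern, text)] = via_match == via_search

print(all(agreement.values()))  # True
```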

Instead of using Field, the fields property of the Config class can be used to set all of the arguments above except default.
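
A minimal sketch of the Config.fields route, assuming pydantic 1.x behaviour. The alias and title values are hypothetical, and the import guard is only there so the snippet also runs against pydantic 2's bundled pydantic.v1 namespace. Here the default is still given on the field itself, while alias and title come from the config.

```python
try:
    # pydantic 2 ships the 1.x API described here under pydantic.v1
    from pydantic.v1 import BaseModel
except ImportError:
    # plain pydantic 1.x installation
    from pydantic import BaseModel


class Model(BaseModel):
    snap: int = 42  # the default is set here, not in Config.fields

    class Config:
        # hypothetical values, intended to mirror
        # Field(42, alias='Snap', title='The Snap')
        fields = {'snap': {'alias': 'Snap', 'title': 'The Snap'}}


properties = Model.schema()['properties']
print(properties)
```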

3.2.3. Unenforced Field constraints

If pydantic finds constraints which are not being enforced, an error will be raised. If you want to force the constraint to appear in the schema, even though it's not being checked upon parsing, you can pass the raw schema attribute name to Field() as an extra keyword argument:

from pydantic import BaseModel, Field, PositiveInt

try:
    # this won't work since PositiveInt takes precedence over the
    # constraints defined in Field meaning they're ignored
    class Model(BaseModel):
        foo: PositiveInt = Field(..., lt=10)
except ValueError as e:
    print(e)
    """
    On field "foo" the following field constraints are set but not enforced:
    lt.
    For more details see https://pydantic-docs.helpmanual.io/usage/schema/#unenforced-field-constraints
    """


# but you can set the schema attribute directly:
# (Note: here exclusiveMaximum will not be enforced)
class Model(BaseModel):
    foo: PositiveInt = Field(..., exclusiveMaximum=10)


print(Model.schema())
"""
{
    'title': 'Model',
    'type': 'object',
    'properties': {
        'foo': {
            'title': 'Foo',
            'exclusiveMaximum': 10,
            'exclusiveMinimum': 0,
            'type': 'integer',
        },
    },
    'required': ['foo'],
}
"""


# if you find yourself needing this, an alternative is to declare
# the constraints in Field (or you could use conint());
# here both constraints will be enforced and the schema
# will be generated correctly:
class Model(BaseModel):
    foo: int = Field(..., gt=0, lt=10)


print(Model.schema())
"""
{
    'title': 'Model',
    'type': 'object',
    'properties': {
        'foo': {
            'title': 'Foo',
            'exclusiveMinimum': 0,
            'exclusiveMaximum': 10,
            'type': 'integer',
        },
    },
    'required': ['foo'],
}
"""

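The conint() alternative mentioned in the comments above can be sketched as follows, assuming pydantic 1.x behaviour. The import guard is only there so the snippet also runs against pydantic 2's bundled pydantic.v1 namespace, and the values 5 and 10 are arbitrary test inputs.

```python
try:
    # pydantic 2 ships the 1.x API described here under pydantic.v1
    from pydantic.v1 import BaseModel, ValidationError, conint
except ImportError:
    # plain pydantic 1.x installation
    from pydantic import BaseModel, ValidationError, conint


class Model(BaseModel):
    # both bounds are exclusive and both are enforced on parsing
    foo: conint(gt=0, lt=10)


accepted = Model(foo=5).foo
print(accepted)  # 5

try:
    Model(foo=10)  # violates lt=10
except ValidationError as exc:
    rejected = True
    print(exc)
else:
    rejected = False

foo_schema = Model.schema()['properties']['foo']
print(foo_schema['exclusiveMinimum'], foo_schema['exclusiveMaximum'])  # 0 10
```
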
3.2.4. typing.Annotated Fields

Rather than assigning a Field value, it can be specified in the type hint with typing.Annotated:

from uuid import uuid4

from pydantic import BaseModel, Field
from typing_extensions import Annotated


class Foo(BaseModel):
    id: Annotated[str, Field(default_factory=lambda: uuid4().hex)]
    name: Annotated[str, Field(max_length=256)] = 'Bar'

Field can only be supplied once per field: an error will be raised if it is used both inside Annotated and as the assigned value. Defaults can be set outside Annotated as the assigned value, or with Field.default_factory inside Annotated; the Field.default argument is not supported inside Annotated.

For versions of Python prior to 3.9, typing_extensions.Annotated can be used.
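
The two supported ways of giving an Annotated field a default can be checked with a short sketch, assuming pydantic 1.x behaviour. Item and its fields are hypothetical, and the import guards are assumptions that let the snippet run on older Python versions and against pydantic 2's bundled pydantic.v1 namespace.

```python
from typing import List

try:
    from typing import Annotated  # Python 3.9+
except ImportError:
    from typing_extensions import Annotated

try:
    # pydantic 2 ships the 1.x API described here under pydantic.v1
    from pydantic.v1 import BaseModel, Field, ValidationError
except ImportError:
    # plain pydantic 1.x installation
    from pydantic import BaseModel, Field, ValidationError


class Item(BaseModel):  # hypothetical model
    # default supplied via default_factory inside Annotated
    tags: Annotated[List[str], Field(default_factory=list)]
    # default supplied outside Annotated as the assigned value
    name: Annotated[str, Field(max_length=5)] = 'Bar'


item = Item()
print(item.tags, item.name)  # [] Bar

try:
    Item(name='too long')  # 8 characters, exceeds max_length=5
except ValidationError:
    too_long_rejected = True
else:
    too_long_rejected = False
print(too_long_rejected)  # True

name_schema = Item.schema()['properties']['name']
print(name_schema['maxLength'])  # 5
```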

3.2.5. Modifying schema in custom fields

Custom field types can customise the schema generated for them using the __modify_schema__ class method; see Custom Data Types for more details.

__modify_schema__ can also take a field argument which will have type ModelField | None. pydantic will inspect the signature of __modify_schema__ to determine whether the field argument should be included.

from pydantic import BaseModel, Field
from pydantic.fields import ModelField


class RestrictedAlphabetStr(str):
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, value, field: ModelField):
        alphabet = field.field_info.extra['alphabet']
        if any(c not in alphabet for c in value):
            raise ValueError(f'{value!r} is not restricted to {alphabet!r}')
        return cls(value)

    @classmethod
    def __modify_schema__(cls, field_schema, field: ModelField | None):
        if field:
            alphabet = field.field_info.extra['alphabet']
            field_schema['examples'] = [c * 3 for c in alphabet]


class MyModel(BaseModel):
    value: RestrictedAlphabetStr = Field(alphabet='ABC')


print(MyModel.schema_json(indent=2))

Outputs:

{
  "title": "MyModel",
  "type": "object",
  "properties": {
    "value": {
      "title": "Value",
      "alphabet": "ABC",
      "examples": [
        "AAA",
        "BBB",
        "CCC"
      ],
      "type": "string"
    }
  },
  "required": [
    "value"
  ]
}

3.2.6. JSON Schema Types

Types, custom field types, and constraints (like max_length) are mapped to the corresponding spec formats in the following priority order (when there is an equivalent available):

JSON Schema Core
JSON Schema Validation
OpenAPI Data Types

The standard format JSON field is used to define pydantic extensions for more complex string sub-types.

The field schema mapping from Python / pydantic to JSON Schema is done as follows:

Schema mappings

3.2.7. Top-level schema generation

You can also generate a top-level JSON Schema that only includes a list of models and related sub-models in its definitions:

import json
from pydantic import BaseModel
from pydantic.schema import schema

class Foo(BaseModel):
    a: str = None

class Model(BaseModel):
    b: Foo

class Bar(BaseModel):
    c: int

top_level_schema = schema([Model, Bar], title='My Schema')
print(json.dumps(top_level_schema, indent=2))

Outputs:

{
  "title": "My Schema",
  "definitions": {
    "Foo": {
      "title": "Foo",
      "type": "object",
      "properties": {
        "a": {
          "title": "A",
          "type": "string"
        }
      }
    },
    "Model": {
      "title": "Model",
      "type": "object",
      "properties": {
        "b": {
          "$ref": "#/definitions/Foo"
        }
      },
      "required": [
        "b"
      ]
    },
    "Bar": {
      "title": "Bar",
      "type": "object",
      "properties": {
        "c": {
          "title": "C",
          "type": "integer"
        }
      },
      "required": [
        "c"
      ]
    }
  }
}