Schema Enforcement (JSON-First)

This page shows how to enforce structured outputs using JSON Schema across providers, with optional Pydantic helpers for validation and retries.

Core Concepts

  • JSON-First: Supply a JSON Schema dict directly. Pydantic models are optional helpers (use MyModel.model_json_schema()).

  • Provider-Aware:

    • OpenAI: Uses native strict response_format.

    • Anthropic: Converts the schema to a single enforced tool call.

    • Other providers: Prompt hint plus post-validation fallback.

  • Validation & Retry: Responses are parsed and validated against a derived Pydantic model with lightweight retries; hard failures raise SchemaValidationException.

  • Runtime Model Generation: SchemaManager loads schemas from dict/file/URL, generates Pydantic models on the fly, and caches them.

Quick Example (JSON-First)

from cellsem_llm_client.agents import LiteLLMAgent

schema = {
    "type": "object",
    "properties": {
        "term": {"type": "string", "description": "Cell type name"},
        "iri": {"type": "string", "format": "uri"},
    },
    "required": ["term", "iri"],
    "additionalProperties": False,
}

agent = LiteLLMAgent(model="gpt-4o", api_key="your-key")
result = agent.query_with_schema(
    message="Return a cell type name and IRI.",
    schema=schema,  # JSON-first
)

print(result.model_dump())  # Pydantic model generated at runtime

Schema Inputs

  • JSON Schema dict (preferred): Pass directly via schema=....

  • Pydantic model: Pass the class or model_json_schema(); the schema is derived for enforcement.

  • Schema name: Place <name>.json in your schema directory and use schema="name"; SchemaManager will load, validate, and cache it.

Under the Hood

  • SchemaManager: Loads schemas (dict/file/URL) and generates Pydantic models.

  • SchemaAdapterFactory: Picks a provider adapter (OpenAI strict, Anthropic tool, fallback prompt hint).

  • SchemaValidator: Parses/validates responses with retries; raises SchemaValidationException on exhaustion.

Notes

  • For OpenAI strict mode, the schema is tightened (e.g., additionalProperties=False) for better enforcement.

  • Anthropic responses are returned from the first enforced tool call’s JSON arguments.

  • You can add custom schema directories by configuring SchemaManager if needed.