Using Structured Output with LLMs | Matías Battaglia
Structured output from LLMs is critical for production applications, yet it’s one of those topics that seems simple in proof-of-concept mode but suddenly becomes a problem at scale. Let’s explore different approaches to reliably getting structured output from LLMs. By this I mean defining a JSON model or schema for how LLM outputs should look, and then coercing the LLM to follow it. It also covers the mechanisms you can use to validate whether the LLM complied, and even to cast values into the right types if needed. You can also view it as applying strong typing to LLM outputs. Why bother? The short answer: downstream integrations.
If you are planning to integrate the LLM’s output with any other system, you will most likely want control over the structure and the fields contained in that output. After discussing this with many customers, I wrote a short guide on strategies for reliably getting structured outputs from LLMs. Without proper structure, LLMs might give you perfect JSON 99% of the time - but that 1% failure rate will break your integrations. The three approaches I cover:

- Native APIs: the most reliable option for the providers that support it, and it requires the least amount of work.
- Function calling: works with any provider/model combo that supports tool use; almost as reliable as native APIs, but needs a bit of extra work.
- Manual implementation: works across all models and providers; needs prompting and custom validation code.
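As an illustration of the native-API approach, here is a minimal sketch using the OpenAI Python SDK's `parse` helper together with a Pydantic model. The model name, prompt wording, and the `Invoice` fields are assumptions made for this example, not details from the guide:

```python
from pydantic import BaseModel

class Invoice(BaseModel):
    vendor: str
    total: float
    currency: str

def extract_invoice(client, text: str) -> Invoice:
    # The `parse` helper constrains the model's output to the Invoice
    # schema and returns an already-validated Pydantic object.
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": f"Extract the invoice data:\n{text}"}],
        response_format=Invoice,
    )
    return completion.choices[0].message.parsed

# The schema itself can be exercised locally, with no API call:
sample = Invoice.model_validate({"vendor": "Acme", "total": 12.5, "currency": "USD"})
```

Because the provider enforces the schema at generation time, the parsing step cannot fail on malformed JSON, which is what makes this the most reliable option where it is available.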
My guide includes practical code examples and a decision framework for choosing the right approach. Link to the post: https://matiasbattaglia.com/2025/09/11/Using-Structured-Outputs-with-LLMs.html

Get reliable JSON from any LLM using structured outputs, JSON mode, Pydantic, Instructor, and Outlines. Complete production guide with OpenAI, Claude, and Gemini code examples for consistent data extraction. Agenta is the open-source LLMOps platform: prompt management, evals, and LLM observability all in one place.

LLMs excel at creative tasks. They write code, summarize documents, and draft emails with impressive results. But ask for structured JSON and you get inconsistent formats, malformed syntax, and unpredictable field names. The problem gets worse in production.
A prompt that works perfectly in testing starts failing after a model update. Your JSON parser breaks on unexpected field types. Your application crashes because the LLM decided to rename "status" to "current_state" without warning. In the rapidly evolving world of Large Language Models (LLMs), getting consistent, predictable results can be challenging. While LLMs excel at generating human-like text, they often return information in free-form formats that require additional parsing and processing. This is where structured output comes in — a powerful technique that constrains LLM responses to predefined formats, making them more reliable and immediately useful for downstream applications.
In this article, we’ll explore three practical methods for implementing structured output with LLMs using LangChain and examine why this approach is becoming essential for production-ready AI applications. Before diving into implementation details, let’s understand why structured output is so valuable: Without structured output, you might get a helpful response, but extracting specific pieces of information becomes a manual task. Structured output transforms LLMs from conversational tools into reliable data providers. Pydantic provides a clean, Pythonic way to define data structures with built-in validation. When combined with LangChain, it creates a powerful mechanism for structured LLM outputs.
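A sketch of how that combination looks in practice, assuming a chat model object `llm` that supports LangChain's `with_structured_output`; the `Person` schema and its fields are illustrative:

```python
from pydantic import BaseModel, Field

class Person(BaseModel):
    name: str = Field(description="Full name of the person")
    age: int = Field(description="Age in years")

def extract_person(llm, text: str) -> Person:
    # with_structured_output binds the Pydantic schema to the model,
    # so the chain returns a validated Person instead of raw text.
    structured_llm = llm.with_structured_output(Person)
    return structured_llm.invoke(f"Extract the person mentioned: {text}")

# Validation works locally, independent of any model call:
p = Person(name="Ada Lovelace", age=36)
```

The field descriptions double as documentation for the model: LangChain passes them along with the schema, which tends to improve extraction quality.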
A guide — and prototype — for getting clean data out of PDFs

Journalists get PDFs as responses to FOIA requests, through document dumps, and via white papers. To make use of PDFs, these journalists need to get data out of documents and into an analysis-friendly format, like a spreadsheet. The process can involve laborious manual transcription or copying and pasting data from one format to another. Theoretically, large language models can assist with document processing, but risks like hallucinations and the inherent uncertainty of LLM outputs make this approach tricky. Journalists need to be certain the output actually contains the needed data, follows the needed data types, and is in a usable format.
Structured outputs offer a solution to these challenges. Providers like Anthropic and OpenAI, and open source libraries like Outlines, allow developers to define strict schemas that constrain LLM responses to specific fields, data types, and formats. Structured outputs transform raw LLM capabilities into reliable data processing pipelines. When extracting tables from multi-page PDFs, for instance, a schema ensures consistent column names and data types across pages. While this approach cannot guarantee perfect accuracy, it reduces the engineering complexity of parsing and validating LLM responses, making document processing workflows both more reliable and more maintainable.

An overview of popular techniques to confine LLMs' output to a predefined schema
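For instance, a per-page schema for table extraction might look like the following sketch; the `ExpenseRow`/`ExpensePage` names and fields are invented for illustration. Pydantic coerces and validates each row, so column names and types stay consistent across pages:

```python
from pydantic import BaseModel

class ExpenseRow(BaseModel):
    date: str          # e.g. "2024-10-17"
    description: str
    amount: float

class ExpensePage(BaseModel):
    page_number: int
    rows: list[ExpenseRow]

# A numeric string is coerced to float; a malformed row would raise
# a ValidationError instead of silently corrupting the table.
page = ExpensePage.model_validate({
    "page_number": 1,
    "rows": [{"date": "2024-10-17", "description": "Travel", "amount": "41.20"}],
})
```

Applying the same `ExpensePage` schema to every page of the PDF is what guarantees uniform column names and types in the final spreadsheet.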
Today, the most common interface for interacting with LLMs is through the classic chat UI found in ChatGPT, Gemini, or DeepSeek. The interface is quite simple, where the user inputs a body of text and the model responds with another body, which may or may not follow a specific structure. Since humans can understand unstructured natural language, this interface is suitable and quite effective for the target audience it was designed for. However, the user base of LLMs is much larger than the 8 billion humans living on Earth. It expands to millions of software programs that can potentially harness the power of such large generative models. Unlike humans, software programs cannot understand unstructured data, preventing them from exploiting the knowledge generated by these neural networks.
To address this issue, various techniques have been developed to generate outputs from LLMs following a predefined schema. This article will overview three of the most popular approaches for producing structured outputs from LLMs. It is written for engineers interested in integrating LLMs into their software applications. Structured output generation from LLMs involves using these models to produce data that adheres to a predefined schema, rather than generating unstructured text. The schema can be defined in various formats, with JSON and regex being the most common. For example, when utilizing JSON format, the schema specifies the expected keys and the data types (such as int, string, float, etc.) for each value.
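A JSON schema of this kind can be written by hand or derived from a typed model. As a sketch using Pydantic (the `Ticket` model and its fields are illustrative), the generated schema spells out exactly the keys and value types described above:

```python
from pydantic import BaseModel

class Ticket(BaseModel):
    id: int
    status: str
    priority: float

# model_json_schema() emits a standard JSON Schema document whose
# "properties" section maps each key to its expected JSON type.
schema = Ticket.model_json_schema()
```

This generated schema is what you hand to a provider's structured-output API (or to a library like Outlines) so the decoder only produces conforming JSON objects.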
The LLM then outputs a JSON object that includes only the defined keys and correctly formatted values.

Chapter 2, “Structured Output,” of the book Taming LLMs is now available for review. Visit the GitHub repo to access the chapter in several formats; the PDF is recommended, as it contains the highest-quality copy. Please share feedback via Substack, LinkedIn, or Twitter.
Over the past weeks, I’ve tested various methods to get LLMs to produce consistent, well-structured responses — an essential step when integrating these models into programmatic solutions, where free-form text can break automated workflows and data pipelines. Below are two approaches that helped me solve these problems: prompt engineering and Pydantic schema validation. Converting raw LLM text into predictable formats like JSON streamlines integration with APIs, databases, and UIs. Without structure, even extracting a single field can become error-prone; for instance, {"date": "2024-10-17"} is much easier to process than “Next Thursday.” The prompt-engineering approach relies on explicit instructions in the prompt to guide the model’s output format. Using OpenAI’s SDK with Pydantic, you can define strict schemas and parse responses directly into Python objects.
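A minimal sketch of the prompt-engineering variant, with the model's reply hard-coded so the parsing and validation steps are visible in isolation; the prompt wording and the `Event` fields are assumptions for the example:

```python
import json
from pydantic import BaseModel

class Event(BaseModel):
    name: str
    date: str  # expected as YYYY-MM-DD

PROMPT = (
    "Extract the event as JSON with exactly these keys: "
    '"name" (string) and "date" (YYYY-MM-DD). Reply with JSON only.'
)

# A well-behaved reply, hard-coded here in place of a live model call:
raw_reply = '{"name": "Launch party", "date": "2024-10-17"}'

# Parse the text, then validate it against the schema in one step.
event = Event.model_validate(json.loads(raw_reply))
```

In a real pipeline the `json.loads` call is where a non-compliant reply fails, so it is worth wrapping the parse-and-validate step in a retry loop.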
Validation as a Guardrail

In one text-classification workflow, the LLM sometimes returned incomplete fields for category tags. By defining a CategoryItem(BaseModel) with the required keys, Pydantic immediately caught missing tags and enforced type correctness (e.g., tags: list[str]). This quick feedback loop saved a lot of time troubleshooting downstream logic.

Have you ever faced issues with insufficient control over your large language model's output when developing applications? Often, models are not configured to produce results tailored to the application's specific needs, leading to outputs that are less useful or relevant. This oversight is a recurring issue in many projects.
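The guardrail pattern described above can be sketched roughly like this (field names are illustrative); invalid payloads are rejected at the boundary instead of reaching downstream logic:

```python
from typing import Optional
from pydantic import BaseModel, ValidationError

class CategoryItem(BaseModel):
    text: str
    tags: list[str]

def validate_item(payload: dict) -> Optional[CategoryItem]:
    # Reject incomplete or mistyped model output up front, so the
    # rest of the pipeline only ever sees well-formed items.
    try:
        return CategoryItem.model_validate(payload)
    except ValidationError:
        return None

ok = validate_item({"text": "refund request", "tags": ["billing"]})
bad = validate_item({"text": "refund request"})  # missing tags
```

Returning `None` is just one policy; logging the `ValidationError` and re-prompting the model with the error message is a common alternative.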
Developers frequently neglect to design their prompts in a way that guides the model to generate outputs that can be practically applied to their use case. This misstep can result in inefficiencies and additional work to manually adjust or interpret the model’s responses. A simple fix is to specify the output format in the prompt: be precise about what output is required and what structure it should follow. Passing the expected structure along with the request helps a great deal in such situations. LangChain addresses this problem by offering a comprehensive suite of tools known as Output Parsers.
These tools are specifically designed to help developers manage and format the model's output effectively. Output parsers enable precise control over the structure and content of the responses generated by the language model, ensuring that the outputs align closely with the intended application. By utilizing output parsers, developers can streamline their workflows, reduce errors, and enhance the overall usefulness of their applications. This makes LangChain a powerful resource for anyone looking to harness the full potential of large language models in a controlled and productive manner. Let's get started with techniques to address this issue effectively. In the first technique, we pass the structure of the output we expect the LLM to give, simply by including it in the prompt.
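A minimal sketch of embedding the expected structure directly in the prompt, here by serializing a Pydantic model's JSON schema into the instructions; the `Review` model and the review text are illustrative:

```python
import json
from pydantic import BaseModel

class Review(BaseModel):
    sentiment: str  # e.g. "positive" | "negative" | "neutral"
    score: int      # 1-5

# Embed the machine-readable schema in the prompt so the model
# knows exactly which keys and types are expected.
format_instructions = json.dumps(Review.model_json_schema(), indent=2)
prompt = (
    "Classify the review below. Respond with a single JSON object "
    f"matching this schema:\n{format_instructions}\n\n"
    "Review: The battery life is fantastic!"
)
```

The model's reply can then be fed through `json.loads` and `Review.model_validate`, which is exactly the parse-and-validate loop that LangChain's output parsers automate for you.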