Taming LLM Outputs: Your Guide to Structured Text Generation
Large language models (LLMs) are like wild animals — powerful and versatile, but unpredictable and potentially dangerous. This makes deploying robust LLM applications challenging. In this blog post, we present the notion of structured text generation, which enables practitioners to “tame” LLMs by imposing formatting constraints on their outputs. Structured text generation methods are available for four main categories of formatting constraints. The simplest one is restricting the LLM’s outputs to a predefined set of options.
For example, when implementing an LLM-as-a-judge approach, we may want to generate a score from 1 to 5, in which case we would expect only five answers: “1”, “2”, “3”, “4”, and “5”. More general constraints can be expressed through regular expressions. The most typical example is the generation of a JSON object adhering to a specific JSON schema. For example, if we perform sentiment analysis on online reviews, the expected LLM response may be a JSON object with two properties: “sentiment” which is a string with three potential values (“positive”, “negative”, and... An even wider family of constraints is based on formal grammars, which are particularly interesting when we want to obtain syntactically correct computer code through an LLM. Finally, formatting constraints can take the form of templates, which are dynamic, fill-in-the-blank texts whose placeholders are meant to be filled by an LLM.
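The two simplest constraint families above, a fixed set of options and a regular expression, coincide for the LLM-as-a-judge score: the five allowed answers can be written down as one regex. A minimal stdlib sketch (the validator itself is ours, purely illustrative):

```python
import re

# The judge may answer only "1" through "5"; as a regex that is simply [1-5].
SCORE_RE = re.compile(r"[1-5]")

def is_valid_score(text: str) -> bool:
    """Check whether a model reply is exactly one of the five allowed scores."""
    return SCORE_RE.fullmatch(text.strip()) is not None

print(is_valid_score("4"))    # -> True
print(is_valid_score("4/5"))  # -> False: extra characters violate the constraint
```

Constrained generation goes one step further than this post-hoc check: instead of rejecting “4/5” after the fact, the decoder is never allowed to produce it.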
Posted on Jul 11, 2025 • Edited on Mar 7 Hello, I'm Shrijith. I'm building git-lrc, an AI code reviewer that runs on every commit. It is free, unlimited, and source-available on Github. Star Us to help devs discover the project. Do give it a try and share your feedback for improving the product.
Getting structured output from an LLM — like JSON, specific types, or regex-compliant text — can feel like herding cats. Tools like Outlines make this easier by guaranteeing structured output directly during generation, even for large, multi-part responses. This post dives into how Outlines works, why it’s a game-changer for developers, and how you can use it to avoid parsing nightmares. We’ll explore code examples, key concepts, and practical tips to make your LLM projects more reliable. LLMs often generate freeform text, which is great for creative tasks but a headache when you need structured data like JSON, integers, or specific formats. Parsing raw LLM output is error-prone — think broken JSON, inconsistent formats, or extra fluff.
Outlines solves this by enforcing structure at the generation step, not after. This approach is perfect for tasks like API response formatting, customer support ticket parsing, or extracting structured data from text. Let’s break down how it works. In limits, there is freedom. Creativity thrives within structure.
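Under the hood, Outlines compiles the constraint into a finite-state machine over the model’s vocabulary and masks invalid tokens at every decoding step. The following is a toy, stdlib-only sketch of that core idea, not the library’s actual API: at each step, only tokens that keep the output a prefix of some allowed string may be chosen.

```python
# Toy illustration of choice-constrained decoding (NOT Outlines' real API).
# Vocabulary and allowed outputs are invented for the example.
VOCAB = ["pos", "neg", "itive", "ative", "neutral"]
ALLOWED = {"positive", "negative"}

def valid_next_tokens(prefix: str) -> list[str]:
    """Tokens that keep the running output a prefix of an allowed string."""
    return [t for t in VOCAB if any(s.startswith(prefix + t) for s in ALLOWED)]

def greedy_constrained(score_fn) -> str:
    """Greedy decoding where the 'model' only ever sees valid tokens."""
    out = ""
    while out not in ALLOWED:
        candidates = valid_next_tokens(out)
        out += max(candidates, key=score_fn)  # stand-in for picking the top logit
    return out

# A stand-in for model logits: prefer longer tokens.
print(greedy_constrained(len))  # -> positive
```

Because invalid tokens are masked before sampling, the output is guaranteed to be one of the allowed strings; no retry loop or post-hoc validation is needed.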
While Language Models excel at generating human-like text, they face challenges when tasked with producing structured output in a consistent manner [Shorten et al., 2024, Tang et al., 2024]. This limitation becomes particularly problematic when integrating LLMs into production systems that require well-formatted data for downstream processing through databases, APIs, or other software applications. Even carefully crafted prompts cannot guarantee that an LLM will maintain the expected structure throughout its response. But what user needs drive the demand for LLM output constraints? In a recent work by Google Research [Liu et al., 2024], the authors explored the user need for constraints on the output of large language models, drawing on a survey of 51 industry professionals... User needs can be broadly categorized as follows:
1. Improving Developer Efficiency and Workflow
Reducing Trial and Error in Prompt Engineering: Developers find the process of crafting prompts to elicit desired output formats time-consuming, often involving extensive testing and iteration. LLM output constraints could make this process more efficient and predictable.

Chapter 2, “Structured Output”, of the book Taming LLMs is now available for review. Visit the GitHub repo to access the chapter in the following formats:
The PDF format is recommended as it contains the highest-quality copy. Please share feedback via one of the following: send a message via Substack, LinkedIn, or Twitter.
About LLM structured output, it seems like everything has been said already. We’ve got JSON schemas, Pydantic models, and enough Medium articles to fill a small library. Yet here’s what rarely gets mentioned at AI conferences (where a MacBook plastered with brain illustration stickers is practically the entry fee): you can use examples to force LLMs to shape their responses to...
Few-shot examples represent a powerful alternative to model fine-tuning that eliminates the need for parameter optimization, dataset curation, and computational overhead. They provide immediate control over model behavior without requiring infrastructure changes or additional training cycles. Examples serve as behavioral constraints that guide the model toward desired output patterns while maintaining inference efficiency. This approach proves particularly valuable in production environments where consistency and predictability are essential — because nothing ruins your day quite like an AI that decides to get creative with your API responses. It’s not only about the right format: what if you could show the model the reasoning you expect behind the output (like chain-of-thought)? Instead of just demanding the right JSON structure, you’re essentially saying, “Hey, I want to see your mental math too.” It’s like having a transparent AI that can’t help but explain how it arrived...
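A few-shot prompt of this kind is just careful string assembly: each example pair demonstrates both the JSON shape and the reasoning field you want echoed back. A minimal sketch, where the example pairs and field names are our own invention, purely illustrative:

```python
# Build a few-shot prompt whose examples double as format (and reasoning)
# constraints. The example pairs below are invented for illustration.
EXAMPLES = [
    ("The checkout crashed twice today.",
     '{"reasoning": "Crash reports indicate frustration.", "sentiment": "negative"}'),
    ("Love the new dashboard, super snappy!",
     '{"reasoning": "Praise and enthusiasm.", "sentiment": "positive"}'),
]

def build_prompt(review: str) -> str:
    """Assemble instruction + worked examples + the new input to classify."""
    parts = ["Classify the review. Reply with JSON only.\n"]
    for text, answer in EXAMPLES:
        parts.append(f"Review: {text}\nAnswer: {answer}\n")
    parts.append(f"Review: {review}\nAnswer:")
    return "\n".join(parts)

print(build_prompt("Docs are outdated."))
```

Ending the prompt at `Answer:` nudges the model to continue directly with a JSON object matching the demonstrated pattern, reasoning field included.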
Field-like examples provide granular control over individual output components. Complete object examples demonstrate full-context relationships between fields and establish structural patterns for complex outputs. Ready to transform your LLM outputs from a wild beast into a well-behaved JSON generator? Buckle up, because we’re about to turn that unpredictable text generator into your personal structured data dispenser! Picking up where our deep dive on structured generation using constrained token generation left off, let’s turn theory into practice with a hands-on guide to structured generation! In this guide (and the accompanying video), we’re diving deep into the world of JSON generation with LLMs.
And trust me, if you’ve ever found yourself desperately parsing through paragraphs of explanatory text just to find that one JSON object you asked for, this is going to be your new favorite bedtime... Warning: side effects may include significantly fewer 3 AM production incidents!😉 Let’s start with what I like to call the “overly helpful assistant syndrome,” which shows up when we try to get a simple JSON object for a video game character. Oh boy. Our LLM friend here decides to throw in a doctoral thesis worth of explanations, complete with Python examples, implementation suggestions, and probably its grandmother’s secret recipe. It’s like asking for directions and getting the entire history of cartography!
You've been there. You ask GPT to "return a JSON object with the user's name, email, and sentiment score." It returns a perfectly formatted JSON... wrapped in a markdown code block. With a helpful explanation. And a disclaimer about how it's an AI.
So you write a regex to strip the code fences. Then another regex for the trailing commentary. Then it randomly returns JSONL instead of JSON. Then it wraps everything in {"result": ...} when you didn't ask for that. Then it works perfectly for 10,000 requests and fails catastrophically on request 10,001 because the user's name contained a quote character. This is the structured output problem, and in 2026, you should not be solving it by hand anymore.
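The quote-character failure above is easy to reproduce. In a sketch with an invented name field, a hand-rolled regex truncates at the first escaped quote, while a real JSON parser handles the escape correctly:

```python
import json
import re

# Valid JSON whose string value contains escaped quotes (invented example).
raw = '{"name": "Ann \\"Tiny\\" Lee", "email": "ann@example.com"}'

# The naive regex stops at the escaped quote and returns a mangled value...
m = re.search(r'"name":\s*"([^"]*)"', raw)
print(m.group(1))              # prints: Ann \

# ...while json.loads understands JSON string escaping.
print(json.loads(raw)["name"])  # prints: Ann "Tiny" Lee
```

This is the request-10,001 bug in miniature: the regex works on every input until one value happens to contain a character the pattern never accounted for.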
Every major LLM provider now offers native structured output. The tooling (Pydantic for Python, Zod for TypeScript) has matured enormously. And yet, most developers are still either parsing raw strings or using function calling as a hacky workaround. This guide covers everything: how structured output actually works under the hood, how to implement it across OpenAI, Anthropic, and Gemini, the Python and TypeScript ecosystems, and — most importantly — the production pitfalls...

Large Language Models excel at generating human-like text, but their free-form nature presents challenges when applications need predictable, machine-readable outputs. Whether you’re building chatbots that need to extract user intents, data pipelines that process LLM responses, or agents that execute specific actions, converting unstructured text into reliable structured data is essential.
This guide explores the techniques, tools, and best practices for parsing LLM outputs and generating structured responses that integrate seamlessly with downstream systems. Large Language Models are fundamentally trained to predict the next token in a sequence, making them exceptional at generating fluent, contextually appropriate text. However, this same characteristic creates significant challenges when applications require structured, predictable outputs. An LLM might respond to a data extraction request with beautifully formatted prose that’s difficult to parse programmatically, or it might return valid information wrapped in conversational filler that requires complex post-processing. The unpredictability manifests in several ways. First, formatting inconsistency means that even when prompted to return JSON, an LLM might include markdown code fences, explanatory text before or after the JSON, or malformed syntax that breaks standard parsers.
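When native structured output is unavailable and you must salvage JSON from a chatty reply, a tolerant extractor beats ad-hoc regexes. A best-effort sketch (the helper and sample reply are ours; the brace counting is deliberately naive and would miscount braces inside string values):

```python
import json
import re

def extract_json(text: str) -> dict:
    """Best-effort: strip markdown fences, then parse the first balanced {...}.
    Note: naive brace counting; braces inside string values would fool it."""
    text = re.sub(r"```(?:json)?", "", text)  # drop ```json / ``` fences
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start:i + 1])
    raise ValueError("unbalanced JSON object")

reply = ('Sure! Here is the JSON you asked for:\n'
         '```json\n{"name": "Ada", "score": 5}\n```\n'
         'Let me know if you need anything else.')
print(extract_json(reply))  # -> {'name': 'Ada', 'score': 5}
```

Even with a helper like this, treat extraction as a fallback: constrained generation or a provider’s native structured-output mode removes the problem at the source.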