Why I'm Using TOON with LLMs

11.19.2025

If you’ve ever pasted a massive JSON file into an LLM and watched your token count (and costs) skyrocket, you know the struggle. TOON (Token-Oriented Object Notation) is a new data format designed for efficient machine-to-AI communication.

Braces, quotes, and repeated field names all count toward your LLM costs and total context length. That’s where TOON comes in: a tiny, friendly rescue format built just for this problem, making it both easier and cheaper for AI to parse your data.

TOON keeps every bit of structure and meaning from JSON, but trims the fat.

So, What Is TOON?

TOON is a compact, human-readable encoding of the JSON data model for LLM prompts. Think of TOON as a translation layer. You keep using JSON in your code, but you convert it to TOON when sending it to AI.

Why TOON?

Because TOON uses fewer tokens than standard JSON, you can fit more data into a single prompt, or into each prompt of a longer exchange. That ultimately saves you money, and you get the best of both worlds: the readability of YAML and the compactness of CSV.

TOON uses YAML-style indentation to show how data is nested, but switches to a CSV-like table layout for lists of uniform objects. The design is meant to help AI parse your data with ease.

Here’s the same data in three formats, starting with JSON:

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

Standard JSON is great for machines, but it’s verbose and token-expensive. For uniform arrays of objects, JSON repeats every field name for every record.

users:
  - id: 1
    name: Alice
    role: admin
  - id: 2
    name: Bob
    role: user

YAML reduces some of that redundancy by using indentation rather than braces, but it still repeats every field name for every record.

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

TOON cuts the bloat even more, declaring your headers once and then streaming the rest of the data as compact rows.
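To make the header-plus-rows idea concrete, here is a minimal Python sketch that produces the tabular block above from a list of dicts. The `[2]` in the header is the row count and `{id,name,role}` is the field list. The function names `encode_table` and `format_value` are my own, not part of any TOON library, and the sketch assumes every row has the same keys and that no value needs quoting (no commas or newlines inside strings). A real encoder handles far more cases.

```python
def format_value(value):
    """Render one primitive the way the TOON samples in this post show it."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    return str(value)


def encode_table(name, rows):
    """Encode a uniform list of dicts as a TOON-style tabular block.

    Hypothetical helper for illustration only: it assumes identical keys in
    every row and values that never need quoting or escaping.
    """
    if not rows:
        raise ValueError("this sketch only handles non-empty lists")
    fields = list(rows[0].keys())
    for row in rows:
        if list(row.keys()) != fields:
            raise ValueError("rows are not uniform, so the table layout does not apply")
    # Header: name, declared row count, then the field names, written once.
    header = name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = [header]
    for row in rows:
        # Each record becomes one indented, comma-separated row of values.
        lines.append("  " + ",".join(format_value(row[f]) for f in fields))
    return "\n".join(lines)


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(encode_table("users", users))
```

Running it prints the same three lines as the TOON sample: the field names appear once in the header instead of once per record.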

When Not to Use TOON

According to the TOON documentation, there are definitely times when this encoding is not the best choice. Deeply nested and irregular structures might not benefit as much from TOON encoding, but for tabular or repeated patterns it’s a no-brainer optimization.

The following are examples of when not to use TOON, according to the documentation:

Deeply nested or non-uniform structures (tabular eligibility ≈ 0%): JSON-compact often uses fewer tokens. Example: complex configuration objects with many nested levels.
Semi-uniform arrays (~40-60% tabular eligibility): token savings diminish. Use JSON if your pipelines already rely on it.
Pure tabular data: CSV is smaller than TOON for flat tables. TOON adds minimal overhead (~5-10%) to provide structure (array length declarations, field headers, delimiter scoping) that improves LLM reliability.
Latency-critical applications: benchmark on your exact setup. Some deployments (especially local/quantized models) may process compact JSON faster despite TOON’s lower token count.
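One way to act on the first two points is to check whether an array is uniform before deciding how to encode it. The sketch below is a rough heuristic of my own, not the documentation’s definition of tabular eligibility: `is_tabular` is a hypothetical helper that returns True only for a non-empty list of dicts that share the same keys and hold only primitive values.

```python
PRIMITIVES = (str, int, float, bool, type(None))


def is_tabular(value):
    """Return True for a non-empty list of flat dicts that all share the same keys.

    Hypothetical heuristic for illustration; the TOON docs may define
    tabular eligibility differently.
    """
    if not isinstance(value, list) or not value:
        return False
    if not all(isinstance(item, dict) for item in value):
        return False
    keys = set(value[0].keys())
    for item in value:
        if set(item.keys()) != keys:
            return False  # non-uniform: some record has different fields
        if not all(isinstance(v, PRIMITIVES) for v in item.values()):
            return False  # nested objects or arrays break the flat row layout
    return True


uniform = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
irregular = [{"id": 1, "name": "Alice"}, {"id": 2, "email": "bob@example.com"}]
nested = [{"id": 1, "settings": {"theme": "dark"}}]

print(is_tabular(uniform))    # True
print(is_tabular(irregular))  # False
print(is_tabular(nested))     # False
```

If most of your payload fails a check like this, that is a hint that compact JSON may serve you just as well.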

When You Should Use TOON

TOON shines with uniform arrays of objects, meaning data with the same structure across items. Think rows from an API, event logs, listings, or any pipeline where token cost is a real factor.

It aims to:

Make uniform arrays of objects as compact as possible by declaring structure once and streaming data.
Stay fully lossless and deterministic – round-trips preserve all data and structure.
Keep parsing simple and robust for both LLMs and humans through explicit structure markers.
Provide validation guardrails (array lengths, field counts) that help detect truncation and malformed output.
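The guardrail in the last point is easy to picture in code. Here is a minimal sketch, assuming a simple tabular block with comma delimiters and no quoted values. `check_table` and the `HEADER` pattern are hypothetical names of my own, not an official TOON parser. It compares the declared row count and field list against what actually arrived.

```python
import re

# Matches headers of the form name[N]{field1,field2,...}:
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")


def check_table(block):
    """Return a list of problems found in a simple TOON-style tabular block.

    Illustrative only: splits rows on commas, so quoted values that contain
    commas are not supported.
    """
    lines = [line for line in block.splitlines() if line.strip()]
    if not lines:
        return ["empty block"]
    match = HEADER.match(lines[0].strip())
    if not match:
        return ["header does not look like name[N]{fields}:"]
    declared = int(match.group(2))
    fields = match.group(3).split(",")
    rows = lines[1:]
    problems = []
    if len(rows) != declared:
        # A mismatch here usually means the output was cut off or padded.
        problems.append(f"declared {declared} rows, found {len(rows)}")
    for number, row in enumerate(rows, start=1):
        cells = row.strip().split(",")
        if len(cells) != len(fields):
            problems.append(
                f"row {number}: expected {len(fields)} fields, found {len(cells)}"
            )
    return problems


truncated = "users[2]{id,name,role}:\n  1,Alice,admin"
print(check_table(truncated))  # ['declared 2 rows, found 1']
```

With plain JSON, a response cut off mid-array simply fails to parse. Here the declared length tells you exactly how many rows went missing.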

All of this contributes to lower API charges, more usable context and often smoother parsing results when you’re programmatically interacting with models.

Let’s Look at an Example

The following is a simple experiment showing the difference in token usage between JSON & TOON formats:

{
  "context": {
    "task": "Our favorite hikes together",
    "location": "Boulder",
    "season": "spring_2025"
  },
  "friends": [
    "ana",
    "luis",
    "sam"
  ],
  "hikes": [
    {
      "id": 1,
      "name": "Blue Lake Trail",
      "distanceKm": 7.5,
      "elevationGain": 320,
      "companion": "ana",
      "wasSunny": true
    },
    {
      "id": 2,
      "name": "Ridge Overlook",
      "distanceKm": 9.2,
      "elevationGain": 540,
      "companion": "luis",
      "wasSunny": false
    },
    {
      "id": 3,
      "name": "Wildflower Loop",
      "distanceKm": 5.1,
      "elevationGain": 180,
      "companion": "sam",
      "wasSunny": true
    }
  ]
}

That JSON input would total 229 tokens and is 680 characters long. Here’s the same data, encoded with TOON:

context:
  task: Our favorite hikes together
  location: Boulder
  season: spring_2025
friends[3]: ana,luis,sam
hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
  1,Blue Lake Trail,7.5,320,ana,true
  2,Ridge Overlook,9.2,540,luis,false
  3,Wildflower Loop,5.1,180,sam,true

TOON encoding brings the token count down to 104, a 54.6% decrease, and the character count down to 286, about 57.9% fewer characters. You can also choose a different delimiter, which could improve token efficiency even more.
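If you want to check the numbers yourself, the character counts and percentages are easy to reproduce in Python. In this sketch the token counts (229 and 104) are taken as given from the experiment above, since they depend on which tokenizer you use; only the character counts are computed. `percent_reduction` is a helper name of my own.

```python
import json

data = {
    "context": {
        "task": "Our favorite hikes together",
        "location": "Boulder",
        "season": "spring_2025",
    },
    "friends": ["ana", "luis", "sam"],
    "hikes": [
        {"id": 1, "name": "Blue Lake Trail", "distanceKm": 7.5,
         "elevationGain": 320, "companion": "ana", "wasSunny": True},
        {"id": 2, "name": "Ridge Overlook", "distanceKm": 9.2,
         "elevationGain": 540, "companion": "luis", "wasSunny": False},
        {"id": 3, "name": "Wildflower Loop", "distanceKm": 5.1,
         "elevationGain": 180, "companion": "sam", "wasSunny": True},
    ],
}

# Pretty-printed JSON with two-space indentation, as shown in the post.
json_text = json.dumps(data, indent=2)

# The TOON version, copied verbatim from the post.
toon_text = "\n".join([
    "context:",
    "  task: Our favorite hikes together",
    "  location: Boulder",
    "  season: spring_2025",
    "friends[3]: ana,luis,sam",
    "hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:",
    "  1,Blue Lake Trail,7.5,320,ana,true",
    "  2,Ridge Overlook,9.2,540,luis,false",
    "  3,Wildflower Loop,5.1,180,sam,true",
])


def percent_reduction(before, after):
    """Percentage saved when going from `before` to `after`, to one decimal."""
    return round((before - after) / before * 100, 1)


print(len(json_text), len(toon_text))                     # 680 286
print(percent_reduction(len(json_text), len(toon_text)))  # 57.9
print(percent_reduction(229, 104))                        # 54.6 (token counts from the post)
```

Keep in mind the comparison is against pretty-printed JSON. Minified JSON would narrow the gap somewhat, so it’s worth measuring against whatever you actually send today.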

TL;DR

TOON is a streamlined way to feed structured data to AI without the “JSON tax.” By cutting out repetitive formatting, it lowers your token costs and can make your data easier for models to process reliably.

Whether you’re sending massive datasets to a model or asking it to generate structured responses, TOON can help keep your AI interactions fast, cheap, and accurate.

Give it a try next time you’re passing structured data into a model. Your bank balance (and context window) will thank you. If you’re ready to dive in and begin using TOON, start with the documentation and introduction guide.