Dev.to
6/7/2026
LLM Wire Format Benchmark: Which Format Can AI Actually Read and Write?

Short summary

A benchmark comparing the JSON, TOON, and GCF wire formats across 10 LLMs shows JSON collapsing at 500 or more records: GPT-5.5 returns empty strings, and Opus miscounts 500 records as 356. GCF achieves 100% comprehension and 5/5 valid generations on all frontier models while using 79% fewer tokens than JSON. The benchmark attributes this to GCF's hierarchical design, which it says prevents attention decay and eliminates the column-filtering computation that fails at scale.

  • JSON breaks at 500+ records; GCF maintains 100% accuracy across all tested models
  • GCF uses 79% fewer tokens than JSON while being more accurate than both JSON and TOON
  • Hierarchical structure (GCF) outperforms flat tabular formats because attention doesn't decay and filtering is implicit
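
To make the token-savings claim more concrete, here is a minimal sketch of how a size reduction between two encodings of the same records could be measured. It rests on several assumptions: the summary does not show GCF or TOON syntax, so `encode_header_once` is a hypothetical stand-in that simply writes field names once instead of per record, the sample records are invented, and character count is used as a rough proxy for token count. It is not the benchmark's method and will not reproduce the 79% figure.

```python
import json


def encode_json(records):
    """Serialize records as a standard JSON array of objects."""
    return json.dumps(records)


def encode_header_once(records):
    """Hypothetical compact encoding (NOT GCF or TOON syntax):
    field names appear once, followed by one comma-separated row per record."""
    fields = list(records[0].keys())
    lines = [",".join(fields)]
    for record in records:
        lines.append(",".join(str(record[f]) for f in fields))
    return "\n".join(lines)


def percent_saved(baseline, compact):
    """Percentage reduction in length of `compact` relative to `baseline`."""
    return 100.0 * (len(baseline) - len(compact)) / len(baseline)


# Invented sample data: 500 records, the size at which the summary says JSON breaks down.
records = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(500)]

as_json = encode_json(records)
as_rows = encode_header_once(records)
saving = percent_saved(as_json, as_rows)

print(f"JSON: {len(as_json)} chars, header-once: {len(as_rows)} chars, saved: {saving:.1f}%")
```

A real comparison would swap the character count for the tokenizer of each model under test, since token boundaries differ between models and formats.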

Generated with AI, which can make mistakes.
