Dev.to
6/7/2026

LLM Wire Format Benchmark: Which Format Can AI Actually Read and Write?
Short summary
A benchmark comparing the JSON, TOON, and GCF wire formats across 10 LLMs reports that JSON collapses at 500+ records: GPT-5.5 returns empty strings, and Opus miscounts 500 records as 356. GCF achieves 100% comprehension and 5/5 valid generations on all frontier models while using 79% fewer tokens than JSON. According to the benchmark, GCF's hierarchical design prevents attention decay and removes the column-filtering computation that fails at scale.
- JSON breaks at 500+ records; GCF maintains 100% accuracy across all tested models
- GCF uses 79% fewer tokens than JSON while being more accurate than both JSON and TOON
- Hierarchical structure (GCF) outperforms flat tabular formats because attention doesn't decay and filtering is implicit
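
To make the token-savings claim concrete, here is a minimal sketch of why a format that states field names once can be much smaller than a JSON array of objects, which repeats every key in every record. The `encode_header_once` encoding is a hypothetical stand-in invented for this illustration; it is not the actual GCF or TOON syntax, and character count is used only as a rough proxy for tokens, so the ratio it prints will not match the benchmark's figure.

```python
import json


def make_records(n):
    """Build n small, uniform records (made-up sample data)."""
    return [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(n)]


def encode_json(records):
    """Standard JSON: every record repeats every field name."""
    return json.dumps(records)


def encode_header_once(records):
    """Hypothetical compact encoding (NOT real GCF/TOON syntax):
    field names appear once, followed by one comma-separated row per record."""
    fields = list(records[0].keys())
    lines = [",".join(fields)]
    for record in records:
        lines.append(",".join(str(record[field]) for field in fields))
    return "\n".join(lines)


records = make_records(500)
json_text = encode_json(records)
compact_text = encode_header_once(records)

# Character count is only a crude proxy for tokens; a real comparison
# would use the tokenizer of the model under test.
saving = 1 - len(compact_text) / len(json_text)
print(f"JSON: {len(json_text)} chars, header-once: {len(compact_text)} chars")
print(f"Reduction: {saving:.0%}")
```

The sketch only covers size. It says nothing about comprehension or generation accuracy, which the benchmark measures separately.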



