Dev.to
5/12/2026

Kenji's Ramen: How Gemma 4 Runs the NPC That NVIDIA's Demo Never Built
Short summary
A developer built Kenji Sato, a sophisticated NPC that runs locally on Gemma 4 (5B parameters) with trust tiers and real refusal behavior, outperforming NVIDIA's cloud demo. The character uses a 17-section contract to vary its responses with relationship depth, and it maintains behavioral integrity across all model sizes. Open-source code, benchmarks across models, and test suites are included; the NPC runs at 3.5s/turn on a consumer GPU.
- Builds a bounded NPC with trust tiers, disclosure gates, and character depth using local Gemma 4 models
- Benchmarks multiple models; Gemma 4 e2b (5B) achieves 3.5s/turn on consumer hardware with perfect boundary adherence
- Open-source with test suites, interactive demo, and architectural details for local model deployment in games
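The summary does not show how the trust tiers and disclosure gates are implemented, so the following is only a minimal sketch of the general idea, not the project's actual contract. The tier names, score thresholds, topic names, and every function here (`tier_for`, `allowed_topics`, `build_system_prompt`) are hypothetical and chosen purely for illustration. The idea is that the game tracks a trust score, maps it to a tier, and tells the local model which topics it may discuss and which it must refuse.

```python
from dataclasses import dataclass
from enum import IntEnum


class TrustTier(IntEnum):
    """Hypothetical relationship tiers, ordered from least to most trusted."""
    STRANGER = 0
    REGULAR = 1
    CONFIDANT = 2


@dataclass(frozen=True)
class GatedTopic:
    """A topic the NPC only discusses at or above a minimum tier."""
    name: str
    min_tier: TrustTier


# Hypothetical topics for illustration; the real contract is not shown in the summary.
TOPICS = [
    GatedTopic("menu", TrustTier.STRANGER),
    GatedTopic("family_recipe", TrustTier.REGULAR),
    GatedTopic("personal_history", TrustTier.CONFIDANT),
]


def tier_for(trust_score: int) -> TrustTier:
    """Map a numeric trust score to a tier (thresholds are assumptions)."""
    if trust_score >= 10:
        return TrustTier.CONFIDANT
    if trust_score >= 4:
        return TrustTier.REGULAR
    return TrustTier.STRANGER


def allowed_topics(trust_score: int) -> list[str]:
    """Return the names of topics unlocked at this trust score, sorted."""
    tier = tier_for(trust_score)
    return sorted(t.name for t in TOPICS if t.min_tier <= tier)


def build_system_prompt(trust_score: int) -> str:
    """Assemble the gate section of a system prompt for the local model."""
    allowed = allowed_topics(trust_score)
    refused = sorted(t.name for t in TOPICS if t.name not in allowed)
    return (
        f"Trust tier: {tier_for(trust_score).name}\n"
        f"May discuss: {', '.join(allowed) or 'nothing'}\n"
        f"Must politely refuse: {', '.join(refused) or 'nothing'}"
    )


if __name__ == "__main__":
    print(build_system_prompt(5))
```

Computing the gate in game code and passing only the result to the model keeps the boundary deterministic: the model is asked to play the refusal in character, not to decide on its own whether the player has earned the disclosure.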
Generated with AI, which can make mistakes.


