AI Evals, Part 2: Error Analysis, the Unglamorous Superpower Behind Good Evals
Before you build a single metric, you have to read your AI's failures and name them. Error analysis is the highest-leverage, most-skipped step in evals on a live .NET product.
When your voice AI can't hit 300ms, you stop trying to be fast and start lying convincingly. Five perception hacks I use, with cost, payoff, and where they backfire.
I've been writing about AI coding tools for months here on Dev.to. Comparisons, benchmarks, tutorials...
How a Phaser 3 endless vertical climber went from idea to public release — procedural levels that are provably beatable, chiptune juice, and 53 tests.