AI article
I needed to know if the cheaper model was good enough. So I built an LLM-as-a-Judge pipeline
Benchmarks are useful, but they don't really tell me whether a prompt change or cheaper model is good...
Dev.to | Apr 6, 2026 | archminor