r/LocalLLaMA
New bench designed for smaller models: ObviousBench.com
AI can build entire companies in a day, yet it can still struggle to spell its own name, and users notice. So I built ObviousBench.com: a benchmark for visible LLM failures. The surprising result is not just which model wins. It is how much visible failure risk changes when teams choose smaller, cheaper, faster, or lower