
In my experience everyone openly talks about how benchmarks are bullshit. On Twitter, in podcast interviews, or wherever, everyone knows benchmarks are a problem. It's never praise.

Of course they tout benchmark numbers because, let's be real, if they didn't tout benchmarks you're not going to bother using the model. For example, if someone posts some random model on Hugging Face with no benchmarks, you just won't proceed.

Humans have a really strong prior to not waste time. We always, always evaluate things hierarchically. We start with some prior, and then whatever is easiest goes next, even if it's a shitty, unreliable measure.

For example, for Gemini 3 everyone will start with a prior that it is going to be good. Then they will look at benchmarks, and only then will they move to harder evaluations on their own use cases.



I don't use them regardless of the benchmarks, but I take your point.

Regardless, though, I think the marketing could be more transparent.



