Let's say you get a bunch of database timeouts in a row. This might mean that nothing needs to be fixed. But also, the "thing that needs to be fixed" might be "the ethernet cable fell out the back of your server".
You have an alert on what users actually care about, like the overall success rate. When it goes off, you check the WARNING log and metric dashboard and see that requests are timing out.
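To make "an alert on the overall success rate" concrete, here is a minimal sketch, assuming you can report each request's outcome to a small in-process tracker. The class name `SuccessRateAlert`, the window size, and the threshold are all hypothetical; a real setup would more likely express this as a rule in whatever metrics system you already run.

```python
from collections import deque


class SuccessRateAlert:
    """Hypothetical tracker: keeps the last `window` request outcomes."""

    def __init__(self, window=100, threshold=0.95):
        # deque with maxlen drops the oldest outcome automatically
        self.outcomes = deque(maxlen=window)  # True = success, False = failure
        self.threshold = threshold

    def record(self, success):
        """Record the outcome of one request."""
        self.outcomes.append(bool(success))

    def success_rate(self):
        """Fraction of successful requests in the current window."""
        if not self.outcomes:
            return 1.0  # assumption: no traffic yet counts as healthy
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        """True when the windowed success rate drops below the threshold."""
        return self.success_rate() < self.threshold
```

The point of alerting on this number rather than on individual timeouts is that it only fires when users are actually affected; the WARNING log and dashboards are then where you go to find out why.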
Well, yes. If the cable falls out of the server (or there's a power outage, or a major DDoS attack, or whatever), your users are going to experience that before you are aware of it. Especially if it's in the middle of the night and you don't have an active night shift.
Expecting arbitrary services to be able to deal with absolutely any kind of failure in such a way that users never notice is deeply unrealistic.
How do you know?