The cases where optimizing compilers aren't good enough are exactly where the Java HotSpot compiler and similar techniques really shine. Combined with novel superoptimization techniques, hotspot-style optimization could far outperform hand-optimization (although AFAIK that hasn't happened in practice yet).
I completely agree, although HotSpot is really hampered by the fact that objects are allocated willy-nilly on the heap instead of laid out together for cache and pipeline efficiency. So while HotSpot (and Graal) generate lovely assembly, the lack of data locality kills a lot of possible performance. Hoping objectlayout.org changes that! Here's a minimal hand-rolled sketch of the problem (hypothetical Point example, not the ObjectLayout API, which aims to give you the flattened layout without giving up objects):

    public class LocalityDemo {
        static final class Point { double x, y; }

        public static void main(String[] args) {
            int n = 1_000_000;
            // Array of objects: pts[i] is a reference, and each Point can
            // land anywhere on the heap, so a linear scan chases a pointer
            // per element.
            Point[] pts = new Point[n];
            for (int i = 0; i < n; i++) pts[i] = new Point();
            double slow = 0;
            for (int i = 0; i < n; i++) slow += pts[i].x + pts[i].y;

            // Hand-flattened "structure of arrays": the data itself is
            // contiguous, so the prefetcher and the vectorizer have
            // something to work with.
            double[] xs = new double[n], ys = new double[n];
            double fast = 0;
            for (int i = 0; i < n; i++) fast += xs[i] + ys[i];

            System.out.println(slow + " " + fast);
        }
    }

Both loops do the same arithmetic; only the second walks contiguous memory.
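(The flattening above is exactly the kind of thing you'd rather have the runtime do for you, which is my read of what objectlayout.org is after.)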
Yet I think that HotSpot and intrinsics are a nice case study showing why optimising compilers are not dead, even for performance-critical code. https://news.ycombinator.com/item?id=9368137 discusses in part how hand-optimised intrinsics at some point get beaten by the optimiser (SIMD/superword), then get hand-optimised again to beat the optimiser once more. Mostly because machines change.
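For the curious, this is roughly the loop shape that C2's SuperWord pass can auto-vectorise; a sketch, assuming a simple counted loop over arrays with no calls or exotic control flow:

    public class SuperWordDemo {
        // A straight-line counted loop over arrays: the shape HotSpot's
        // C2 SuperWord pass looks for when emitting SIMD instructions.
        static void axpy(float a, float[] x, float[] y) {
            for (int i = 0; i < x.length; i++) {
                y[i] = a * x[i] + y[i];
            }
        }

        public static void main(String[] args) {
            float[] x = new float[4096], y = new float[4096];
            java.util.Arrays.fill(x, 1.0f);
            // Run it hot enough that C2 compiles it.
            for (int i = 0; i < 10_000; i++) axpy(2.0f, x, y);
            System.out.println(y[0]);
        }
    }

You can check what actually got emitted with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly (needs the hsdis disassembler plugin): on an AVX machine you should see wide vector instructions, on an older box just SSE.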
A whole problem with static binaries as produced by C and Go compilers is that they assume machines are static, which leads to lowest-common-denominator optimiser settings :( When the optimisers are humans this gets even worse: you end up with optimisations in your C code that were a good idea 15 years ago but make no use of SIMD today (or use too-short SIMD, e.g. SSE2 when AVX-512 is available).
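The JIT side of this is easy to see for yourself: on an x86 build of HotSpot, the JVM probes the CPU at startup and picks the SIMD width accordingly, e.g.:

    # What did HotSpot decide for this machine? (x86 builds only)
    java -XX:+PrintFlagsFinal -version | grep UseAVX

    # Simulate the "lowest common denominator" static binary by pinning
    # the code generator to pre-AVX instructions (MyBenchmark is a
    # placeholder for whatever you're running):
    java -XX:UseAVX=0 MyBenchmark

The same untouched program gets AVX code on a machine that has it and SSE code on one that doesn't, which is exactly the dispatch a static lowest-common-denominator binary gives up.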
Of course real optimisations happen not by doing the same thing faster but by doing a faster thing. Take for example the HMMER (https://news.ycombinator.com/item?id=9368137) family of algorithms. HMMER2 is part of SPEC CPU, and the compiler guys doubled the speed of this algorithm in about 5 years. Then the bioinformaticians redid it as HMMER3, which works quite differently at a global level and gets a 100x speedup in practice.