
Sadly, at least some of the points this quote makes are outdated.

For example, it would behoove a large company to spend a great deal of time optimizing, say, their JSON parsing library. Although it may not show up as a hotspot in any one place in their immense codebase, its extreme prevalence degrades performance subtly but pervasively.

I also measured injected object creation using Guice to be 40x slower than a simple constructor in Java (agree or disagree with the 40x, but using reflection to set a variable instead of simple object construction is intuitively far slower).

Guice may not show up on any profiler as a problem - but if you slow down object creation by a factor of 40x, something you may do thousands of times per second for the life of your program, you are degrading performance across the board. Much the same as if you simply clocked your CPU down a few hundred MHz.
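To make the 40x claim concrete, here's a toy microbenchmark sketch. It doesn't use Guice itself - plain `java.lang.reflect` stands in for what an injector does under the hood - and `Widget` is a made-up class. Treat the numbers as illustrative only; a real harness like JMH is needed for trustworthy measurements:

```java
import java.lang.reflect.Constructor;

public class ReflectionBench {
    static class Widget {
        int id;
        Widget() { this.id = 42; }
    }

    static volatile Object sink;   // keeps the JIT from eliding the allocations

    // Returns reflective-time / direct-time for n instantiations of Widget.
    static double benchRatio(int n) throws Exception {
        Constructor<Widget> ctor = Widget.class.getDeclaredConstructor();

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sink = new Widget();        // direct `new`
        long direct = System.nanoTime() - t0;

        t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sink = ctor.newInstance();  // reflective creation
        long reflective = System.nanoTime() - t0;

        return (double) reflective / direct;
    }

    public static void main(String[] args) throws Exception {
        benchRatio(100_000);   // warm-up pass so the JIT compiles both loops
        System.out.printf("reflective/direct ratio: %.1fx%n", benchRatio(1_000_000));
    }
}
```

The exact ratio varies wildly by JVM version and warm-up, which is part of why this kind of cost hides so well.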



To counter some of the naysayers, suppose your JSON parsing library or Guice object instantiation is slow. In my case, my CSV parsing library is slow. You will change your code to avoid these things. In my case, I’m considering converting the CSV files to Thrift or Protocol Buffers, which means that I’ll have hundreds of gigabytes of reformatted data that I’ll need to rebuild when the CSV changes. But then the CSV parsing would not show up in the profiler any more. If, instead, I magically had a CSV parser that was ten times faster, I would have an overall simpler system that didn’t have to manage a cache of hundreds of gigabytes of reformatted data in order to avoid CSV parsing.

So it’s entirely possible that a slow JSON parsing library (or DI library or whatever) could result in an overall slow and overcomplicated system, and yet not show up in the profiler, because the programmers had added enough overcomplicated, somewhat faster alternatives to avoid using the JSON library much.


"Guice may not show up on any profiler as a problem - but if you slow down object creation by a factor of 40x, something you may do thousands of times per second for the life of your program, you are degrading performance across the board."

If your profiler doesn't point right at the thing that you're doing 40x slower thousands of times, either your profiler is BROKEN or that isn't actually a bottleneck.


If you have ten thousand different kinds of objects then the creation of instances of any given class will not rise to the top of the profile.
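A back-of-the-envelope illustration of that, with entirely made-up numbers (10,000 classes and 0.005% per class are hypothetical):

```java
public class FlatProfile {
    public static void main(String[] args) {
        int sites = 10_000;          // hypothetical: 10,000 distinct injected classes
        double perSitePct = 0.005;   // each one: 0.005% of runtime -- pure noise in a profile
        double aggregatePct = sites * perSitePct;
        System.out.printf("per class: %.3f%%, all injection combined: %.1f%%%n",
                perSitePct, aggregatePct);   // 50% in aggregate, invisible per class
    }
}
```

Unless the profiler can group all those call sites under one "object creation" bucket, no single row ever looks worth optimizing.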


When I care about performance over most anything else, I take this guy's approach: https://www.youtube.com/watch?v=Kdwwvps4J9A

It's not appropriate for every project, but if maximizing performance is your goal, you should be tracking performance constantly as you work, not as an afterthought late in the project.

Late profiling leaves too much room for a thousand small inefficiencies to add up, and it adds too much time between making a decision and understanding its impact on performance.


I'm not sure how what you describe is any different than focusing on the critical parts of the program. If a function is short but is called many many times, that could still be a hot spot.

It's also worth noting that even then, it still may not matter if you're stalled out on something like I/O or whatever (while still keeping in mind other constraints like battery life, CPU throttling, etc).


> Guice may not show up on any profiler as a problem

Then it isn't a problem.

> but if you slow down object creation by a factor of 40x, something you may do thousands of times per second for the life of your program

Then it'll show up in the profiler. You aren't going to do something that slows down the program across the board yet doesn't create new hot spots that show up in the profiler.


Counterexample: anything that involves cache.

You do something that blows away your cache, it'll show up as slowdowns elsewhere in the program.
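A toy sketch of how much cache behavior alone can swing the cost of identical work. Both loops below do the same additions over the same array; only the access pattern differs. Array and stride sizes are made up, and results depend heavily on the machine:

```java
public class CacheSketch {
    // Sums a large array sequentially vs. with page-sized strides; same work,
    // very different cache behavior. Returns {sequentialNanos, stridedNanos}.
    static long[] sweep(int[] data) {
        long sum = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < data.length; i++) sum += data[i];   // prefetcher-friendly
        long seq = System.nanoTime() - t0;

        int stride = 4096 / Integer.BYTES;                      // jump one page of ints
        t0 = System.nanoTime();
        for (int s = 0; s < stride; s++)
            for (int i = s; i < data.length; i += stride) sum += data[i];
        long strided = System.nanoTime() - t0;

        if (sum == -1) System.out.print("");                    // keep `sum` live for the JIT
        return new long[] { seq, strided };
    }

    public static void main(String[] args) {
        int[] data = new int[1 << 24];                          // ~64 MB: larger than any cache
        long[] t = sweep(data);
        System.out.printf("sequential: %d ms, strided: %d ms%n",
                t[0] / 1_000_000, t[1] / 1_000_000);
    }
}
```

The point isn't the specific numbers - it's that a profiler attributes the strided loop's misses to the loop itself, not to whatever evicted the data in the first place.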


> You do something that blows away your cache, it'll show up as slowdowns elsewhere in the program.

Presumably that will show up as additional cache misses at those points in the profile, as suggested, assuming the profiler can count more than cycles on the relevant hardware.


Unfortunately, it is really difficult to diagnose cache effects.

Do you have a good idea of what levels of cache misses are typical? Do you regression-check cache miss rates between versions?

What about the whole "debugging your program makes it slower" observer-effect? Do you know what effect your profiler has on cache?

And again, if you do something that blows away cache, it'll show up as cache misses scattered at random across other parts of your program. Even if you catch it, it can (will) be hard to track back to the actual source.


I thought regression was the issue here.

I'm not convinced cache is so different from all the other aspects of performance we have to deal with in HPC systems (insofar as they're isolated), but no matter. At least there's plenty of tool support and literature.




