
I don't know for sure, but this is generally common because caches get cold.

A lot of websites use a cache in front of databases (or template rendering engines, or many other systems). That cache might evict entries based on time - after 5 minutes, the entry is considered invalid.
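To make the time-based eviction above concrete, here is a minimal sketch of such a cache. The class name `TTLCache`, the injectable `clock` parameter, and the `/front-page` key are all made up for illustration; real caches (memcached, Varnish, a CDN) differ in plenty of details.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Toy cache whose entries expire a fixed number of seconds after being stored."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the example can fake the passage of time
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.ttl:
            # Too old: treated as invalid even though nothing says the value changed.
            del self._store[key]
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._store[key] = (self.clock(), value)


# Usage with a fake clock (hypothetical key and value).
now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])
cache.put("/front-page", "<html>...</html>")
now[0] = 299
hit = cache.get("/front-page")   # still within the 5-minute TTL
now[0] = 301
miss = cache.get("/front-page")  # None: the entry has aged out
```

Note that the entry is dropped purely because of its age; the cache has no idea whether the backend would return something different.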

But that means that if you have no traffic for 10 minutes, the cache completely empties. Then when traffic returns, every request misses the cache and hits the backend directly, which is now overwhelmed. In normal operation the cache shields the backend, but a cold cache isn't doing that job, so the backend sees many more requests than usual.
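A rough way to see the size of that effect is to count cache misses per second in a toy model. Everything here is an assumption picked for illustration: 100 keys, each requested once per second, a 300-second TTL, keys that first become popular 3 seconds apart, and a 10-minute gap with no traffic. The function name `backend_hits` is made up too.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def backend_hits(requests: Iterable[Tuple[int, str]], ttl: int) -> List[int]:
    """Return the timestamps of requests that miss the cache and reach the backend."""
    stored_at: Dict[str, int] = {}
    misses: List[int] = []
    for t, key in requests:
        cached = key in stored_at and t - stored_at[key] < ttl
        if not cached:
            misses.append(t)
            stored_at[key] = t  # the backend response refills the cache
    return misses


KEYS = 100
TTL = 300

# Warm period: key i is requested every second from t = 3*i until t = 899,
# so the keys' expiry times end up spread out rather than synchronized.
warm = [(t, f"key{i}") for t in range(900) for i in range(KEYS) if t >= 3 * i]
# No traffic from t = 900 to t = 1499, then every key is requested at t = 1500.
resume = [(1500, f"key{i}") for i in range(KEYS)]

per_second = Counter(backend_hits(warm + resume, TTL))
warm_peak = max(n for t, n in per_second.items() if t < 900)
cold_peak = per_second[1500]
```

In this model the backend never sees more than 1 miss in any second while the cache is warm, and then takes all 100 requests at once in the first second after the gap.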

In the worst case, those requests pile up in one long serial queue, and the ones at the back may time out. The client may say something like "it's taken me 5 seconds and I still don't have a response - I'll abort and retry!" and now you have even _more_ traffic to deal with.
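Here is a small simulation of that retry effect, as a sketch under simplifying assumptions rather than a model of any real system: a single FIFO queue, a backend that handles a fixed number of requests per tick, clients that all give up after the same timeout, and abandoned requests that stay in the queue and still cost the backend work. The name `simulate` and all the numbers are made up.

```python
from collections import deque
from typing import Deque, Dict, Tuple


def simulate(clients: int, capacity: int, timeout: int, max_ticks: int = 1000) -> Tuple[int, int]:
    """Return (ticks until every client is served, total requests the backend processed)."""
    # Every client sends one request at tick 0; queue entries are (client, sent_at).
    queue: Deque[Tuple[int, int]] = deque((c, 0) for c in range(clients))
    last_sent: Dict[int, int] = {c: 0 for c in range(clients)}  # clients still waiting
    processed = 0
    for tick in range(max_ticks):
        # Clients that have waited too long abandon their request and send a new one.
        for client, sent_at in list(last_sent.items()):
            if tick - sent_at >= timeout:
                last_sent[client] = tick
                queue.append((client, tick))
        # The backend works through the queue in order, stale requests included.
        for _ in range(min(capacity, len(queue))):
            client, sent_at = queue.popleft()
            processed += 1
            if last_sent.get(client) == sent_at:
                del last_sent[client]  # the client was still waiting on this attempt
        if not last_sent:
            return tick + 1, processed
    return max_ticks, processed  # gave up: some clients were never served


no_retry = simulate(clients=100, capacity=10, timeout=1000)  # (10, 100)
with_retry = simulate(clients=100, capacity=10, timeout=6)   # (16, 160)
stuck = simulate(clients=100, capacity=10, timeout=5)        # (1000, 10000)
```

With patient clients the backend does 100 units of work in 10 ticks. With a 6-tick timeout it does 160 units over 16 ticks, because it keeps answering requests nobody is waiting for anymore. With a 5-tick timeout this toy model never drains at all: every fresh retry goes stale while the backend is still chewing through the previous round.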

So cold caches and retries can conspire to keep a service down for a long time even after the root cause is fixed.



I'm accustomed to cache-eviction policies based on LRU, age, etc. But in my systems, eviction happens only when (a) the content is known to be invalid, or (b) there's competition for cache space.

IIUC the parent comment, it's describing a policy that evicts entries even when (a) and (b) are both false. Is that common in the web-hosting / CDN world? Or is age considered a proxy for stale?


Right, age is used as a proxy for stale, because we often don't have anything better.

A lot of web systems work this way - DNS records for example use a "TTL" which means "time to live." If the TTL is 60, then you throw it out of the cache after 60 seconds even if you have room in the cache, and you have no reason to believe it's invalid. This lets independent entities (like a DNS authority) make a change and get it rolled out everywhere.

I think the reason this is common is that proving cache invalidity is so hard, especially with the typical "dumb" cache appliances that are widely used. They just do stuff like cache the response bytes for a particular URL; they might not even understand HTTP beyond interpreting the request's headers, and certainly don't really understand the response.



