
>Gradual deployments are a more reliable defense against bugs than careful programming

The challenge, as I understand it, is that the feature in question had an explicit requirement of fast, wide deployment because of the need to react in real time to changing external attacker behaviors.

Yeah, I don’t know how fast “fast” needs to be in this system, but my understanding is that this particular failure would have been seen immediately on the first replica. The progression could still be aggressive after verifying the first wave.
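
To make "verify the first wave, then go aggressive" concrete, here is a rough Python sketch. It is not how the system in question actually deploys anything; `plan_waves`, `staged_rollout`, `push`, `is_healthy`, and the default wave sizes are all hypothetical names and numbers chosen for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")


def plan_waves(total: int, first_wave: int = 1, growth: int = 10) -> List[int]:
    """Return wave sizes: a small first wave, then geometrically larger ones."""
    if total < 0 or first_wave < 1 or growth < 2:
        raise ValueError("need total >= 0, first_wave >= 1, growth >= 2")
    sizes: List[int] = []
    remaining = total
    size = first_wave
    while remaining > 0:
        step = min(size, remaining)  # the last wave takes whatever is left
        sizes.append(step)
        remaining -= step
        size *= growth
    return sizes


def staged_rollout(
    replicas: Sequence[R],
    push: Callable[[R], None],         # hypothetical: applies the change to one replica
    is_healthy: Callable[[R], bool],   # hypothetical: post-push health check
    first_wave: int = 1,
    growth: int = 10,
) -> List[R]:
    """Push wave by wave and stop after the first wave with an unhealthy replica.

    Returns the replicas that received the change, including a failed wave.
    """
    pushed: List[R] = []
    start = 0
    for size in plan_waves(len(replicas), first_wave, growth):
        wave = replicas[start:start + size]
        for replica in wave:
            push(replica)
        pushed.extend(wave)
        if not all(is_healthy(replica) for replica in wave):
            break  # halt here; the remaining replicas keep the old version
        start += size
    return pushed
```

With these assumed defaults, 1000 replicas are covered in four waves (1, 10, 100, 889), so a change that crashes every replica it touches stops after one replica, while a good change pays for only three extra health checks before reaching the whole fleet.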

Yep, and this exact requirement also caused the same outage back in 2013 or so. DDoS rules were pushed to the GFE (edge proxy) every 15 seconds, and a bad release got out. Every single GFE worldwide crashed within 15 seconds. That outage is in the SRE book.

Is there a link to the SRE book?
