The thing that made Google SRE scale was in the fine print and covered in this article. SRE is a premium service, and in order to get the premium service you had to meet some criteria to show you weren't about to turn highly paid, highly knowledgeable folks into professional firefighters.
During my time as an SRE at multiple organizations, what I witnessed was executives who owned Operations teams renaming the team, paying for a Python course, and calling it "change". Operations Engineers have a culture unto themselves that will not be broken in a year's time: a culture of firefighting and heroism as opposed to proactivity and dull standards. They're rarely, if ever, programmers, which was a prerequisite for an SRE-SE or SRE-SWE. You don't need to be a master of logic to know that if you dress your old pig up with lipstick and a new title, it's still a pig.
I agree with the article: read the SRE book and understand why they did what they did, and if you hire SRE-SE or SRE-SWE types, don't colocate them with Operations types or make them a 1:1 map with your dev teams that haven't earned them.
I heard about that; amongst other things, if the number of times SRE gets pinged because of a software fault exceeds some threshold X, they basically give the pager back to the team that built the software. They aren't taking responsibility for reliability if the software is not reliable.
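To make that hand-back rule concrete, here is a minimal sketch of how such a check might be expressed. Everything in it is an assumption for illustration: the `Page` record, the `should_hand_back_pager` function, and the threshold parameter are hypothetical names, and the actual value of X and how Google counts pages are not something I know.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Page:
    """One pager alert. Hypothetical record, not a real paging API."""
    service: str
    caused_by_software_fault: bool


def should_hand_back_pager(
    pages: Iterable[Page],
    service: str,
    max_software_fault_pages: int,
) -> bool:
    """Return True when pages caused by software faults in `service`
    exceed the agreed threshold (the "X" from the comment above)."""
    fault_pages = sum(
        1
        for page in pages
        if page.service == service and page.caused_by_software_fault
    )
    return fault_pages > max_software_fault_pages


if __name__ == "__main__":
    # Illustrative data only; the threshold of 2 is made up.
    recent_pages = [
        Page("checkout", True),
        Page("checkout", True),
        Page("checkout", True),
        Page("checkout", False),  # e.g. a hardware or network issue
        Page("search", True),
    ]
    print(should_hand_back_pager(recent_pages, "checkout", 2))
```

The point of the sketch is only that the rule is mechanical: pages not caused by the software itself don't count against the dev team, and once the count passes the agreed number, the pager goes back.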
I really appreciated working with a team of SRE types; it gave me a newfound appreciation for quality in software development. I remember one instance where a colleague on my team went up to the 'ops' team and wanted SSH access to a production server to check on some settings (environment variables, I believe). He thought it was a trivial thing, you know, a "just lemme have a peek" kinda thing, but the ops team flat-out told him no: if he needs to print env vars, he can do it in his own code. We did continuous releases, so a patch could be deployed within minutes if it passed all the checks. I loved that the ops team had the mandate from higher up to say no to requests like this, and I found them a lot more professional than the software developers, including myself.
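For what it's worth, the "do it in your own code" route can be tiny. A rough sketch, assuming the app decides up front which variables are safe to show; the `visible_settings` helper and the variable names in the example are made up for illustration, and only `os.environ` is real:

```python
import os
from typing import Iterable, Mapping, Optional


def visible_settings(
    allowed_names: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return only the allowlisted environment variables.

    Anything not on the allowlist (secrets, tokens) is never read,
    and a missing variable shows up as "<unset>" rather than failing.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name, "<unset>") for name in sorted(allowed_names)}


if __name__ == "__main__":
    # Hypothetical setting names; log these at startup or behind a debug endpoint.
    for name, value in visible_settings(["APP_ENV", "LOG_LEVEL"]).items():
        print(f"{name}={value}")
```

Using an allowlist rather than dumping everything is a deliberate choice here: it keeps credentials out of the logs, which is presumably part of why nobody was handing out SSH in the first place.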
Ideally, what that developer wanted should be a function of the platform. That's likely not in an operations team's scope because they're mostly just firefighters. Also ideally, in a situation like SOX where devs can't have access to production, their preproduction environments share the same interface that production has, and if their values differ, that much is documented.
> the ops team flat-out told him no, if he needs to print env vars, he can do it in his own code
This is still the wrong attitude from a "productization" of infrastructure perspective. Configuring the environment a program runs in is a core responsibility of infrastructure. So is being able to query it. Build that infrastructure product feature.
Or select a platform that makes this trivial, e.g. Kubernetes.