> While we don’t yet officially support 130K nodes, we're very encouraged by these findings. If your workloads require this level of scale, reach out to us to discuss your specific needs.
Obviously this is a typical Google experiment in running a K8s cluster at 130K nodes, but if there is a company out there that "requires" this scale, I must question their architecture and their infrastructure costs.
But of course someone will always claim that they somehow need this sort of scale to run their enterprise app. So once again, let's remind the pre-revenue startups talking about scale before they hit PMF:
Unless you are ready to donate tens of billions of dollars yearly, you do not need this.
People at my co are horny to adopt k8s. Really, tech leads want to put it on their resume ("resume driven development") and use a tool that was made to solve a particular problem we never had. The downside is that we now need to be proficient at it, know how to troubleshoot it, etc. It was sold to leadership as something that would make our lives easier, but the exact opposite has happened.
I think k8s has a learning curve, absolutely, and there are absolutely cases where it can be unnecessary overhead. But I actually think those cases are pretty rare. If you're running multiple apps, k8s is valuable. There is an initial investment in learning the system, but it's very extensible, flexible, and portable. (Yes, every hyperscaler's implementation of k8s has its own nuances in certain places, but the core concepts of k8s translate very well.)
Use killercoda and get your CKA; I bet most of the confusion will be gone. I've basically started mandating it for newer folks on my team, since it covers so many of the gaps that get created when people try just-in-time learning on these systems. K9s is great for visual people who are used to vim.
I work for a mature public company that most people in the US have at least heard of. We're far from the largest in our industry and we run jobs with more than that almost every night. Not via k8s though.
You have jobs running on more than 130k different machines daily??
Are they cloud based VMs, or your own hardware? If cloud based, do you reprovision all of them daily and incur no cost when you are not running jobs? If it's your own hardware, what else do you do with it when not batch processing?
You are not Google.