I would be interested in seeing this type of article for migrating an established App Engine "standard" environment application instead of the "flexible" environment, which is basically managed containers.
The first-generation standard App Engine service was/is a PaaS environment that provided a lot of functionality on top of the basic load-balanced servers, such as an automatically managed, highly available distributed NoSQL datastore, memcache, task queues, mail services, cron, and some special libraries. So migrating off of it involves not just the infrastructure but also a considerable amount of re-implementing equivalent services.
We migrated from App Engine Standard to GKE. All those additional services you referenced are available as standalone APIs in GCP, so we just migrated the compute to GKE and had our application code call the GCP APIs for those services instead of the App Engine library calls.
I think there is some functionality such as google.appengine.ext.db (not ndb) which is not directly available, though? And task queues seem to be completely different. But I accept that for many applications, there are close alternatives.
Edit: And as a point of reference, our application has hit basically every single GAE limitation/quota over the years, including the 10,000 file limit (mostly source code files). So migration is a serious concern.
I think that ext.db stuff is just a wrapper around Cloud Datastore. As for the task queues, they are almost identical - see Google Cloud Tasks (https://cloud.google.com/tasks). You might be thinking of google cloud pub/sub as the completely different one.
The way we did the migration was to first move our application code to use the first class google cloud APIs and not rely on any GAE standard libs. Once we got to that point we moved our application code to GKE. Not trying to make it sound easy, it was kind of a pain.
I was specifically talking about push queues, not pub/sub. My comments are also based on migrating from Python 2.7 to 3, where there are additional migration challenges and GAE limitations.
At least in our experience, it was (or will be...we'll see) easier to migrate off to a generic container based environment.
push queues are available in Google Cloud Tasks. We use them. There's even an article about how to migrate from the appengine ones to the google cloud tasks one: https://cloud.google.com/tasks/docs/migrating
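For anyone facing this migration, the shape of the change looks roughly like the sketch below. This is a hedged illustration, not a drop-in recipe: the queue, project, and handler names are made up, and it assumes the google-cloud-tasks client library is installed. The runnable part is just a small helper that builds the push-task request body.

```python
# Old GAE way (Python 2.7 runtime):
#   from google.appengine.api import taskqueue
#   taskqueue.add(url="/worker", payload="...", queue_name="default")
#
# Cloud Tasks equivalent (assumes `pip install google-cloud-tasks`;
# project/location/queue names below are placeholders):
#   from google.cloud import tasks_v2
#   client = tasks_v2.CloudTasksClient()

def build_push_task(relative_uri, payload):
    """Request body for a push task targeting an App Engine handler,
    mirroring the old taskqueue.add(url=..., payload=...) call."""
    return {
        "app_engine_http_request": {
            "http_method": "POST",
            "relative_uri": relative_uri,
            "body": payload,
        }
    }

# Then, to enqueue (hypothetical project/queue names):
#   parent = client.queue_path("my-project", "us-central1", "default")
#   client.create_task(request={"parent": parent,
#                               "task": build_push_task("/worker", b"payload")})
```

The migration guide linked above covers the rest (queue.yaml to gcloud-managed queues, retry config, etc.).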
Agreed. I think Cloud Run is the more interesting product, and something I wish other providers had similar offerings to. For someone wanting to build a prototype or the like, having it scale to zero is pretty great.
I have minimal (almost zero) experience with Google Cloud, but if I were to guess, the reason not to consider Cloud Run is the pricing model. While solutions like Cloud Run (and Lambda or Fargate in AWS) give you a lot of flexibility, they are significantly more expensive than orchestrating servers and services yourself. At a small scale the increased pricing is worth it for the flexibility, but there's definitely a point at which it switches and it's too expensive to be worth it anymore. From the blog post it seems they have a sustained average of 30,000 requests per second, which probably puts them in the category where rolling it themselves is financially worth it.
Comparing Cloud Run pricing with VMs and Preemptible VMs you can see why. I'm going to ignore networking bandwidth costs as I assume (maybe incorrectly) that they're the same across Google Cloud services. That leaves us with CPU, Memory and Requests pricing.
CPU Pricing:
Cloud Run: $0.00002400 per vCPU-second [1]
On-demand VM: $0.00000605861 per vCPU-second ($0.021811 / vCPU hour) (~75% cheaper vs. Cloud Run) [2]
Preemptible VM: $0.0000018175 per vCPU-second ($0.006543 / vCPU hour) (~92% cheaper vs. Cloud Run) [2]
Memory Pricing:
Cloud Run: $0.00000250 per GiB-Second [1]
On-demand VM: $0.00000081194 per GiB-second ($0.002923 / GB hour) (~67% cheaper vs. Cloud Run) [2]
Preemptible VM: $0.0000002436 per GiB-second ($0.000877 / GB hour) (~90% cheaper vs. Cloud Run) [2]
Requests Pricing:
This is complicated since Cloud Run charges per request ($0.40 per million requests [1]), which would put them at roughly $378,000 per year at a 30k-requests-per-second average. Load balancers charge per hour and per bandwidth ($0.025 per hour per proxy instance and $0.008 per GB of data transfer [3]), regardless of request count — and their traffic is sustained. I'm not sure exactly how it compares, but I'd guess the load balancer comes out much cheaper on an averaged-out per-request basis.
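The conversions and savings percentages above are easy to sanity-check. A quick script, using the per-hour VM rates quoted from [2] (rounding makes memory come out at ~68% rather than ~67%, but it's the same figure):

```python
# Cloud Run per-second rates [1]
CLOUD_RUN_CPU = 0.00002400        # $/vCPU-second
CLOUD_RUN_MEM = 0.00000250        # $/GiB-second

# VM rates [2], converted from $/hour to $/second
ONDEMAND_CPU = 0.021811 / 3600    # ≈ $0.00000606 per vCPU-second
PREEMPT_CPU  = 0.006543 / 3600    # ≈ $0.00000182 per vCPU-second
ONDEMAND_MEM = 0.002923 / 3600    # ≈ $0.00000081 per GiB-second
PREEMPT_MEM  = 0.000877 / 3600    # ≈ $0.00000024 per GiB-second

def savings(cheaper, baseline):
    """Percent cheaper than the baseline, rounded to whole percent."""
    return round(100 * (1 - cheaper / baseline))

print(savings(ONDEMAND_CPU, CLOUD_RUN_CPU))  # 75
print(savings(PREEMPT_CPU,  CLOUD_RUN_CPU))  # 92
print(savings(ONDEMAND_MEM, CLOUD_RUN_MEM))  # 68
print(savings(PREEMPT_MEM,  CLOUD_RUN_MEM))  # 90

# Request cost at a sustained 30k requests/second, $0.40 per million [1]:
annual_requests = 30_000 * 86_400 * 365
annual_request_cost = annual_requests / 1e6 * 0.40
print(round(annual_request_cost))            # 378432 -> "roughly $378,000/year"
```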
Looking at the CPU and memory pricing alone, you can see the price differential and why Cloud Run might not be worth it once you're paying hundreds of thousands of dollars per year in infrastructure (they might be spending much more, but this is the scale at which it starts to matter). Assuming roughly 80% savings for a mix of on-demand and preemptible VMs vs. Cloud Run: at low usage you're talking about $60/mo vs. $300/mo, but once you're in the $50,000/mo range, you're talking about the difference between $50,000/mo on on-demand/preemptible VMs and $250,000/mo on Cloud Run.
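To spell out those last numbers: the $60 vs. $300 and $50k vs. $250k figures follow directly from the assumed ~80% blended savings, i.e. Cloud Run at roughly 5x the equivalent VM price. A trivial back-of-the-envelope version:

```python
# Assumption: ~80% blended savings (on-demand + preemptible) means
# Cloud Run costs roughly 5x the equivalent VM spend.
VM_TO_CLOUD_RUN = 5

def cloud_run_estimate(vm_monthly_usd):
    """Very rough Cloud Run monthly cost for a workload that costs
    vm_monthly_usd per month on VMs, under the 5x assumption."""
    return vm_monthly_usd * VM_TO_CLOUD_RUN

print(cloud_run_estimate(60))      # 300    ($60/mo on VMs -> ~$300/mo)
print(cloud_run_estimate(50_000))  # 250000 ($50k/mo on VMs -> ~$250k/mo)
```

The 5x multiplier is the weakest link here — as the reply below notes, always-allocated CPU, high per-instance concurrency, and committed use discounts all pull Cloud Run's effective rate down.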
Good point on the sustained usage changing the economics at the scale they're at, I didn't really factor that in.
The pricing calculation is even more complex than it seems. Cloud Run doesn't charge per request if you have "always allocated CPU" (a newer feature). Also, instances can handle up to 1,000 concurrent requests, so we'd need to know processing time and work out how many instances would need to run concurrently. There are also committed use discounts for Cloud Run.
Another factor to consider is operational cost - Cloud Run has a lower ops burden than K8s - particularly if you wanted to run multi-regional instances.
But you're right - at the sustained usage they have, the infrastructure cost of Cloud Run would almost certainly be worse than VMs in GKE, particularly if they could run on a decent percentage of preemptible instances.