I wrote the apex(1) tool and created https://apex.sh/ping/ with Lambda. In general it has been great and has scaled flawlessly since launch (granted, I'm only doing ~8M requests/day).
Conceptually I think it's great for pipelines or use-cases like this. VMs are generally a terrible level of abstraction for a lot of problems, and the Lambda style promotes better architecture because of this.
The connectivity between Kinesis/SNS and friends is great. I'd agree that Lambda is not currently a good fit for "regular" apps; APIs should be fine now that the proxy stuff is in there, though there's slight latency.
No need to worry about gracefully stopping or restarting daemons; just push new code and the old stuff goes away. It really is a great abstraction that way. Basically, replace anything you'd use a Go channel for with more Lambda, or SNS->Lambda if you need retries and backoff; it'll spare you a lot of code.
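For anyone curious what that SNS->Lambda pattern looks like in practice, here's a minimal sketch in Python with boto3 (the topic ARN, payload shape, and process() are placeholders I made up, not anything from apex):

    import json
    import boto3

    sns = boto3.client("sns")
    TOPIC = "arn:aws:sns:us-east-1:123456789012:work-items"  # placeholder

    def process(task):
        print("processing", task)  # stand-in for the real work

    def enqueue(task):
        # Publish a work item; the Lambda subscribed to the topic picks
        # it up, and SNS retries delivery if the function errors out.
        sns.publish(TopicArn=TOPIC, Message=json.dumps(task))

    def handler(event, context):
        # The subscribed function unwraps the SNS envelope and does the work.
        for record in event["Records"]:
            process(json.loads(record["Sns"]["Message"]))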
I find the workflow great as well; the slowest part for me is compiling the Go binaries, the rest is virtually instant. Especially now, with all this needlessly complex Docker stuff around, it's refreshing to use something simple.
Cost is prohibitive for sustained use, so make sure you price things out properly. It sounds very cheap until you look at, say, a constant 100 requests/s behind API Gateway; that's easily 300-400% of what you'd pay on EC2.
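Rough back-of-envelope for that 100 req/s figure, assuming then-current US pricing ($3.50/M API Gateway requests, $0.20/M Lambda invocations, $0.00001667 per GB-second of compute); adjust for your own numbers:

    req = 100 * 60 * 60 * 24 * 30                      # ~259.2M requests/month
    apig = req / 1e6 * 3.50                            # ~$907
    invocations = req / 1e6 * 0.20                     # ~$52
    # assume a 128 MB function running ~100 ms per request:
    compute = req * 0.1 * (128 / 1024.0) * 0.00001667  # ~$54
    print(apig + invocations + compute)                # ~$1013/month

API Gateway is the dominant line item; a couple of EC2 instances behind a load balancer would serve the same sustained load for a fraction of that.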
Cold start is really a non-issue in most cases; it seems to take very little to keep a function warm, so unless you get zero traffic (which would be dirt cheap on a t2.micro anyway) you'll be fine.
Had never heard of apex. Looks really nice and giving it a try now!
Edit: Feedback: Really like it. One feature that would be very nice would be the ability to trigger a ping on demand, e.g. for testing the auth setup on a request. Runscope implements a similar feature. Otherwise great so far!
Thanks! Agreed, I have that on the list, quick sanity check is always good. Taking a bit of a break to work on other products but I'll keep adding to it.
I've looked at Apex a few times. I was going to ask you for a hacker plan, but it sounds like you're already doing such numbers that it might not be justified. It would be really nice for me to be able to have a master subscription covering client projects for my contracting work.
I've thought about adding a smaller plan, and I still might at some point, but it reaches a level where it's not really worth it, especially since I want to provide equal support to everyone. I'll have to experiment with that.
I had limited free plans originally, but that went horribly wrong haha: free users only attract other free users, and a few days later I had like 4000 free people. Maybe that works for startups, but not "real" companies.
If you need an API Gateway that can work with Lambda, but with better performance and feature set, take a look at https://github.com/Mashape/kong - there is an open PR for Lambda support which will land in the next version (I am one of the core committers).
This is very interesting. Especially when you have a large volume of trivial requests, the cost-per-request of AWS API Gateway dominates overall cost (the Lambda functions themselves are cheap by comparison). Rolling one's own in EC2 can potentially be much cheaper.
I haven't tested it extensively but if you boot up API Gateway and a hello-world Lambda function, the function itself takes maybe 1ms to run, while API Gateway seems to add roughly 150-200ms on top of that.
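If you want to reproduce that comparison, something like this works (the function name and URL are placeholders for your own deployment; run it a few times so cold starts don't skew the numbers):

    import time
    import boto3
    import requests

    lam = boto3.client("lambda")
    URL = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/hello"  # placeholder

    def timed(fn):
        start = time.time()
        fn()
        return (time.time() - start) * 1000  # milliseconds

    direct = timed(lambda: lam.invoke(FunctionName="hello-world"))
    via_apig = timed(lambda: requests.get(URL))
    print("direct: %.0f ms, via API Gateway: %.0f ms" % (direct, via_apig))

Note this measures client-observed latency, so network round trips are included in both numbers.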
We gave it a run in prod and abandoned it quickly. We were using it for parallel file processing (S3 -> Lambda for a CPU-heavy task -> return a few numbers as JSON).
Last I calculated, it's nearly 5x the cost of comparable t2 on-demand EC2 instances. That can be mitigated if you have spiky traffic, where you'd need to spin up several EC2 instances for less than an hour but are stuck paying for the full hour, or if you can scale to zero for extended periods of time.
Critical to us: a single Lambda worker can't serve concurrent requests, so if you're waiting on async IO (e.g. downloading from S3) you're wasting money you wouldn't be wasting on EC2. You end up with one file per Lambda worker, whereas we could handle about five files concurrently on a normal VM thanks to async IO. For us, Lambda was prohibitively expensive at scale. I suspect a fair number of the people reporting significant savings vs. EC2 had over-provisioned instances.
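To make the async IO point concrete, this is roughly what the VM buys you that a single Lambda worker (one event at a time) can't do: keep several S3 downloads in flight at once (bucket and keys are placeholders):

    from concurrent.futures import ThreadPoolExecutor
    import boto3

    s3 = boto3.client("s3")
    keys = ["f1.bin", "f2.bin", "f3.bin", "f4.bin", "f5.bin"]  # placeholders

    def fetch(key):
        # IO-bound: the thread sleeps on the network, not the CPU
        s3.download_file("my-bucket", key, "/tmp/" + key)
        return key

    # Five downloads overlap; the CPU-heavy step can chew on one file
    # while the others are still arriving.
    with ThreadPoolExecutor(max_workers=5) as pool:
        for done in pool.map(fetch, keys):
            print("downloaded", done)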
It would be interesting if someone made an EC2 image that implemented the Lambda APIs, handled as many requests as it could, and then optionally delegated the overflow to real Lambda. Like "reserved" Lambdas.
This is a _fascinating_ idea. You could make an image that queried the Lambda API, downloaded all the Lambda functions (and their configuration), and served up exactly those functions. Make it completely turnkey.
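The download half of that already works today; an untested sketch along these lines pulls every function's deployment package plus its config:

    import boto3
    import requests

    lam = boto3.client("lambda")

    for page in lam.get_paginator("list_functions").paginate():
        for fn in page["Functions"]:
            info = lam.get_function(FunctionName=fn["FunctionName"])
            # Code.Location is a presigned URL for the deployment zip
            pkg = requests.get(info["Code"]["Location"]).content
            open(fn["FunctionName"] + ".zip", "wb").write(pkg)
            # info["Configuration"] carries handler, runtime, memory, timeout

The hard part is the serving/overflow side, not the syncing.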
You might want to check out IronFunctions, which was released last week: https://github.com/iron-io/functions . You can run Lambda functions anywhere, and can even export/import them directly from Lambda. It doesn't have the burst-to-Lambda part you speak of though... yet.
TL;DR: AWS Lambda is awesome if you have indeterminate-but-high volumes of small CPU-bound tasks that you want to run in parallel as soon as they are needed. Awesome for file processing (e.g. image resizing), for example.
But being stuck on Python 2.7 sucks. PLEASE Amazon, announce Python 3 support already.
We use Python for an API call that is basically a thin wrapper around one of our supplier's (not very good) APIs, handling ~500k requests per day. We've had problems twice with 500 errors for all requests at night, when they do maintenance on the API Gateway/Lambda hardware and don't tell us (in both cases they announced new features the next morning). We reported it to support both times and got "we're sorry, but since it's all good now there is nothing to do" as the response both times. It left a bad taste in our mouths, so we haven't deployed anything new to it and probably won't for a while, despite it working well otherwise.
We tried it about a year ago and gave up after too much wasted effort on getting any kind of good dev/CI/CD pipeline going, mainly due to lack of tooling for things like environment variables, etc.
Taking another look now on some side projects, to get a sense of whether all of the issues have been addressed.
But the big ones were the lack of env variables, and the tooling was atrocious.
I really like the concept, and the current Serverless (serverless.com) framework has been simplified: it's built on CloudFormation, it lets you run the project locally as an Express server, and with additional plugins it can potentially target competing serverless platforms.
Don't use it if you require low latency - performance is abysmal and the bottleneck is Lambda overhead that you cannot influence.
Also, if you don't have a high-traffic function, cold start is an issue, at least when using Java. There is a "hack" to avoid it: create a CloudWatch schedule that triggers your function every minute or so. This sounds weird, but considering this is accepted practice, it gives you a sense of how production-ready Lambda is. Nice thing is that it's pretty cheap though.
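For reference, the warming hack is just a scheduled CloudWatch Events rule; wired up with boto3 it looks something like this (names and ARNs are placeholders):

    import boto3

    events = boto3.client("events")
    lam = boto3.client("lambda")
    FN_ARN = "arn:aws:lambda:us-east-1:123456789012:function:my-fn"  # placeholder

    # fire every minute
    rule = events.put_rule(Name="keep-warm", ScheduleExpression="rate(1 minute)")
    # let CloudWatch Events invoke the function
    lam.add_permission(
        FunctionName="my-fn",
        StatementId="keep-warm",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    events.put_targets(Rule="keep-warm", Targets=[{"Id": "1", "Arn": FN_ARN}])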
Warming it every minute seems kind of excessive. Did you really have to do it that often? I haven't experienced this with Python, but my jobs are more cron-ish, so an extra 100 ms doesn't make a difference.
Loading the JVM on a cold VM takes a lot more than 100 ms, more like 5-10 seconds. It's the same problem Google App Engine had 7 years ago, before you could pay to keep an instance warm.
I've used Lambda in production under pretty high-throughput conditions before. It's fairly stable, although the Kinesis pipeline feeding into it was a pain to set up.
Recently, I've been using stdlib (https://stdlib.com) for function services, since it's a lot easier to get started with and a bit more intuitive for newcomers to understand.
We use Lambda and API Gateway for all production[1] requests: HTML, server-side rendered JS, API calls, all through Lambda. There are some rough edges; we wrote our own framework/deployment tool[2] to fix some of them. AWS has been making lots of improvements too, many in just the last few months. This setup was much harder to run when we first started a year and a half ago. Overall our team is super happy. It is cheaper and simpler to operate than our previous EC2+OpsWorks setup, and we get code to production faster and spend more time on actual business problems vs. infrastructure problems. I also pretty much agree with everything Ben Kehoe said in this article: https://serverless.zone/serverless-vs-paas-and-docker-three-....
[1] www.bustle.com and www.romper.com do a combined XX million unique visitors per month.
Mostly it is not a big problem for us. Our functions that need to be fast have very high traffic. Our other functions don't need to be that fast. We also have a CDN in front of most things.
That said, we use Node. I have heard cold starts are worse for Python (I have no data) and near unbearable for Java (JVM startup and all that). We try to minimize external calls outside the main handler. Prior to this update we were packing configs in with webpack at deploy time; some people read them from S3 or Dynamo, which I don't consider a good solution. I'll be working this week on Shep 3.0, which will use the new env var features.
I have talked to other people who do various hacks to force their functions to be warm. I've looked at doing this and haven't found it necessary yet for our use case.
I started with a manually-packaged Python function and credstash [1] to store credentials for non-AWS services. It gets invoked nightly by CloudWatch and has been bulletproof for months. I recently built a microservice [2] for Election Day with Flask and deployed it behind API Gateway with Zappa [3]. Lambda isn't for every use case, but for those it is, I'm bullish.
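For anyone who hasn't seen the Zappa workflow: the app itself is just plain Flask, and `zappa init` / `zappa deploy` handle the Lambda and API Gateway wiring. A minimal sketch (the route and payload here are made up, not from my actual service):

    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/status")
    def status():
        return jsonify(ok=True)

    if __name__ == "__main__":
        app.run()  # local development; Zappa takes care of the Lambda side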
You can check out my talk at AWS E-Business Day. For ~50 minutes we talk about how we switched to serverless architectures for one of the largest online retailers in Europe, and what we learned.
I use it for one of my clients to send an average of 400k browser push notifications; competitors send them in 40 minutes, and with Lambda + Kinesis Firehose (16 shards) my solution is able to send them in less than 4 minutes.
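The fan-out side of that is straightforward with boto3; a sketch along these lines, assuming a 16-shard Kinesis stream (shards are a Kinesis Streams concept, and the stream name and payload shape are placeholders):

    import json
    import boto3

    kinesis = boto3.client("kinesis")

    def fan_out(notifications):
        records = [
            {"Data": json.dumps(n), "PartitionKey": n["subscriber_id"]}
            for n in notifications
        ]
        # put_records accepts at most 500 records per call
        for i in range(0, len(records), 500):
            kinesis.put_records(StreamName="push-notifications",
                                Records=records[i:i + 500])

With 16 shards you get up to 16 Lambda instances draining batches in parallel, which is presumably where the ~10x speedup comes from.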
Same here; I've deployed two functions. One is a thumbnail converter for S3 uploads, used a couple of times per day and running for over a year. The other is a log-aggregation script that runs thousands of times per hour and has been going for a few months now. Both run flawlessly; I've never had to worry about them after deploying.
It has been good for us, although our use case has mainly been ETL pipelines, which seem like quite a good fit, especially when you connect it up to Kinesis Firehose/Kinesis or S3 notifications.
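For reference, the S3-notification hookup is just a handler fired per object-created event; something like this (the bucket names and the transform step are placeholders):

    import urllib.parse
    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            # object keys arrive URL-encoded in the event
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            transformed = body.upper()  # stand-in for the real ETL step
            s3.put_object(Bucket="etl-output", Key=key, Body=transformed)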
The main issue we've seen is that even at the "top" tier (1536 MB) the CPU performance doesn't seem very good. It's difficult to tell, though, as they abstract you away from what is actually going on under the hood.
Main advantages: easy to deploy, no infrastructure worries if your problem fits, event-based triggers mean you don't have to write any of that code, and it fits well in the AWS ecosystem.
I've seen people making serverless websites and things like that, but I'm not convinced Lambda is a good fit for such a thing (happy to be proven wrong).
Not the original poster, but IIRC CloudWatch improved a lot in recent months. Before that, the web interface was kinda hard to use with all the log streams; right now it's much easier to search for some event, for example (time brackets were added, and auto-loading of previous events, if I'm not mistaken).
It works remarkably well if you use a development toolchain (Serverless and Claudia are both good, though I prefer the latter).
My only gripe is that API Gateway (APIG) charges $3.50 per million requests, and there is no other way to invoke Lambda functions over HTTP. I had to scale back my microservice ambitions because of this.
Hmmmm, I know it seems ridiculous, but I wonder if it would be cheaper to just spool up an EC2 machine and async-invoke the Lambda call, e.g. via boto?
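Something like this is what I have in mind: a thin HTTP frontend on EC2 that fires the function asynchronously with boto3, skipping API Gateway entirely (function name and payload are placeholders):

    import json
    import boto3

    lam = boto3.client("lambda")

    def invoke_async(payload):
        # InvocationType="Event" returns immediately (HTTP 202) without
        # waiting for the function to finish.
        lam.invoke(
            FunctionName="my-fn",
            InvocationType="Event",
            Payload=json.dumps(payload).encode(),
        )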
It seems like APIG has a boatload of features, which is great... but they are mostly unused.
I use Lambda in production as a replacement for a cron server. So far, all it does is keep my cache warm. So far, no complaints. It was actually TJ Holowaychuk's blog posts on his use of Lambda that got me to give it a shot.