I wrote the apex(1) tool and created https://apex.sh/ping/ with Lambda. In general it has been great and has scaled flawlessly since launch (granted, I'm only doing ~8M requests/day).
Conceptually I think it's great for pipelines or use-cases like this. VMs are generally a terrible level of abstraction for a lot of problems, and the Lambda style promotes better architecture because of this.
The connectivity between Kinesis/SNS and friends is great. I'd agree that Lambda is not currently a good fit for "regular" apps; APIs should be fine now that the proxy stuff is in there, though there's slight latency.
No need to worry about gracefully stopping or restarting daemons; just push new code and the old stuff goes away. It really is a great abstraction that way. Basically, replace anything you'd use a Go channel for with more Lambda, or SNS->Lambda if you need retries and backoff; it'll spare you a lot of code.
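For anyone curious what that SNS->Lambda pattern looks like in practice, here's a minimal sketch in Python with boto3 (the topic ARN, payload shape, and process() are placeholders I made up, not anything from apex):

    import json
    import boto3

    sns = boto3.client("sns")
    TOPIC = "arn:aws:sns:us-east-1:123456789012:work-items"  # placeholder

    def process(task):
        print("processing", task)  # stand-in for the real work

    def enqueue(task):
        # Publish a work item; the Lambda subscribed to the topic picks
        # it up, and SNS retries delivery if the function errors out.
        sns.publish(TopicArn=TOPIC, Message=json.dumps(task))

    def handler(event, context):
        # The subscribed function unwraps the SNS envelope and does the work.
        for record in event["Records"]:
            process(json.loads(record["Sns"]["Message"]))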
I find the workflow great as well; the slowest part for me is compiling the Go binaries, the rest is virtually instant. Especially now, with all this needlessly complex Docker stuff around, it's refreshing to use something simple.
Cost is prohibitive for sustained use, so make sure you price things out properly. It sounds very cheap until you look at, say, a constant 100 requests/s behind API Gateway; that's easily 300-400% of what you'd pay on EC2.
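Rough back-of-envelope for that 100 req/s figure, assuming then-current US pricing ($3.50/M API Gateway requests, $0.20/M Lambda invocations, $0.00001667 per GB-second of compute); adjust for your own numbers:

    req = 100 * 60 * 60 * 24 * 30                      # ~259.2M requests/month
    apig = req / 1e6 * 3.50                            # ~$907
    invocations = req / 1e6 * 0.20                     # ~$52
    # assume a 128 MB function running ~100 ms per request:
    compute = req * 0.1 * (128 / 1024.0) * 0.00001667  # ~$54
    print(apig + invocations + compute)                # ~$1013/month

API Gateway is the dominant line item; a couple of EC2 instances behind a load balancer would serve the same sustained load for a fraction of that.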
Cold start is really a non-issue in most cases; it seems to take very little to keep a function warm, so unless you get zero traffic (which would be dirt cheap on a t2.micro anyway) you'll be fine.
Had never heard of apex. Looks really nice and giving it a try now!
Edit: Feedback: Really like it. One feature that would be very nice would be the ability to trigger a ping on demand, e.g. for testing the auth setup on a request. Runscope implements a similar feature. Otherwise great so far!
Thanks! Agreed, I have that on the list, quick sanity check is always good. Taking a bit of a break to work on other products but I'll keep adding to it.
I've looked at Apex a few times. I was going to ask you for a hacker plan, but it sounds like you're already doing such numbers that it might not be justified. It would be really nice for me to be able to have a master subscription covering client projects for my contracting work.
I've thought about adding a smaller plan, and I still might at some point, but it reaches a level where it's not really worth it, especially since I want to provide equal support to everyone. I'll have to experiment with that.
I had limited free plans originally, but that went horribly wrong haha: free users only attract other free users, and a few days later I had like 4000 free people. Maybe that works for startups, but not "real" companies.
If you need an API Gateway that can work with Lambda, but with better performance and feature set, take a look at https://github.com/Mashape/kong - there is an open PR for Lambda support which will land in the next version (I am one of the core committers).
This is very interesting. Especially when you have a large volume of trivial requests, the cost-per-request of AWS API Gateway dominates overall cost (the Lambda functions themselves are cheap by comparison). Rolling one's own in EC2 can potentially be much cheaper.
I haven't tested it extensively but if you boot up API Gateway and a hello-world Lambda function, the function itself takes maybe 1ms to run, while API Gateway seems to add roughly 150-200ms on top of that.
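If you want to reproduce that comparison, something like this works (the function name and URL are placeholders for your own deployment; run it a few times so cold starts don't skew the numbers):

    import time
    import boto3
    import requests

    lam = boto3.client("lambda")
    URL = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/hello"  # placeholder

    def timed(fn):
        start = time.time()
        fn()
        return (time.time() - start) * 1000  # milliseconds

    direct = timed(lambda: lam.invoke(FunctionName="hello-world"))
    via_apig = timed(lambda: requests.get(URL))
    print("direct: %.0f ms, via API Gateway: %.0f ms" % (direct, via_apig))

Note this measures client-observed latency, so network round trips are included in both numbers.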
We gave it a run in prod and abandoned it quickly. We were using it for parallel file processing (S3 -> Lambda for a CPU-heavy task -> return a few numbers as JSON).
Last I calculated, it's nearly 5x the cost of comparable t2 on-demand EC2 instances. That can be mitigated if you have spiky traffic, where you'd need to spin up several EC2 instances for less than an hour but are stuck paying for the full hour, or if you can scale to zero for extended periods of time.
Critical to us: a single Lambda worker can't serve concurrent requests, so if you're waiting on async IO (e.g. downloading from S3) you're wasting money you wouldn't be wasting on EC2. You end up with one file per Lambda worker, whereas we could handle about five files concurrently on a normal VM thanks to async IO. For us, Lambda was prohibitively expensive at scale. I suspect a fair number of the people reporting significant savings vs. EC2 had over-provisioned instances.
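To make the async IO point concrete, this is roughly what the VM buys you that a single Lambda worker (one event at a time) can't do: keep several S3 downloads in flight at once (bucket and keys are placeholders):

    from concurrent.futures import ThreadPoolExecutor
    import boto3

    s3 = boto3.client("s3")
    keys = ["f1.bin", "f2.bin", "f3.bin", "f4.bin", "f5.bin"]  # placeholders

    def fetch(key):
        # IO-bound: the thread sleeps on the network, not the CPU
        s3.download_file("my-bucket", key, "/tmp/" + key)
        return key

    # Five downloads overlap; the CPU-heavy step can chew on one file
    # while the others are still arriving.
    with ThreadPoolExecutor(max_workers=5) as pool:
        for done in pool.map(fetch, keys):
            print("downloaded", done)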
It would be interesting if someone made an EC2 image that implemented the Lambda APIs, handled as many requests as it could, and then optionally delegated the overflow to real Lambda. Like "reserved" Lambdas.
This is a _fascinating_ idea. You could make an image that queried the Lambda API, downloaded all the Lambda functions (and their configuration), and served up exactly those functions. Make it completely turnkey.
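The download half of that already works today; an untested sketch along these lines pulls every function's deployment package plus its config:

    import boto3
    import requests

    lam = boto3.client("lambda")

    for page in lam.get_paginator("list_functions").paginate():
        for fn in page["Functions"]:
            info = lam.get_function(FunctionName=fn["FunctionName"])
            # Code.Location is a presigned URL for the deployment zip
            pkg = requests.get(info["Code"]["Location"]).content
            open(fn["FunctionName"] + ".zip", "wb").write(pkg)
            # info["Configuration"] carries handler, runtime, memory, timeout

The hard part is the serving/overflow side, not the syncing.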
You might want to check out IronFunctions, which was released last week: https://github.com/iron-io/functions . You can run Lambda functions anywhere, and can even export/import them directly from Lambda. It doesn't have the burst-to-Lambda part you speak of though... yet.
TL;DR: AWS Lambda is awesome if you have indeterminate-but-high volumes of small CPU-bound tasks that you want to run in parallel as soon as they are needed. Awesome for file processing (e.g. image resizing), for example.
But being stuck on Python 2.7 sucks. PLEASE Amazon, announce Python 3 support already.
We use Python for an API call that is basically a thin wrapper around one of our supplier's (not very good) APIs, handling ~500k requests per day. We've had problems twice with 500 errors for all requests at night, when they do maintenance on the API Gateway/Lambda hardware and don't tell us (in both cases they announced new features the next morning). We reported it to support both times and got "we're sorry, but since it's all good now there is nothing to do" as the response both times. It left a bad taste in our mouths, so we haven't deployed anything new to it and probably won't for a while, despite it working well otherwise.
We tried it about a year ago and gave up after too much wasted effort on getting any kind of good dev/CI/CD pipeline going, mainly due to lack of tooling for things like environment variables, etc.
Taking another look now on some side projects, to get a sense of whether all of the issues have been addressed.
But the big ones were the lack of env variables, and the tooling was atrocious.
I really like the concept, and the current Serverless (serverless.com) framework has been simplified: it's built on CloudFormation, it lets you run the project locally as an Express server, and with additional plugins it can potentially target competing serverless platforms.
Don't use it if you require low latency - performance is abysmal and the bottleneck is Lambda overhead that you cannot influence.
Also, if you don't have a high-traffic function, cold start is an issue, at least when using Java. There is a "hack" to avoid it: create a CloudWatch schedule that triggers your function every minute or so. This sounds weird, but considering this is accepted practice, it gives you a sense of how production-ready Lambda is. Nice thing is that it's pretty cheap though.
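For reference, the warming hack is just a scheduled CloudWatch Events rule; wired up with boto3 it looks something like this (names and ARNs are placeholders):

    import boto3

    events = boto3.client("events")
    lam = boto3.client("lambda")
    FN_ARN = "arn:aws:lambda:us-east-1:123456789012:function:my-fn"  # placeholder

    # fire every minute
    rule = events.put_rule(Name="keep-warm", ScheduleExpression="rate(1 minute)")
    # let CloudWatch Events invoke the function
    lam.add_permission(
        FunctionName="my-fn",
        StatementId="keep-warm",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    events.put_targets(Rule="keep-warm", Targets=[{"Id": "1", "Arn": FN_ARN}])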
Warming it every minute seems kind of excessive. Did you really have to do it that often? I haven't experienced this with Python, but my jobs are more cron-ish, so an extra 100 ms doesn't make a difference.
Loading the JVM on a cold VM takes a lot more than 100 ms, more like 5-10 seconds. It's the same problem Google App Engine had 7 years ago, before you could pay to keep an instance warm.
I've used Lambda in production under pretty high-throughput conditions before. It's fairly stable, although the Kinesis pipeline feeding into it was a pain to set up.
Recently, I've been using stdlib (https://stdlib.com) for function services, since it's a lot easier to get started with and a bit more intuitive for newcomers to understand.
We use Lambda and API Gateway for all production[1] requests: HTML, server-side rendered JS, API calls, all through Lambda. There are some rough edges; we wrote our own framework/deployment tool[2] to fix some of them. AWS has been making lots of improvements too, many in just the last few months. This setup was much harder to run when we first started a year and a half ago. Overall our team is super happy. It is cheaper and simpler to operate than our previous EC2+OpsWorks setup, and we get code to production faster and spend more time on actual business problems vs. infrastructure problems. I also pretty much agree with everything Ben Kehoe said in this article: https://serverless.zone/serverless-vs-paas-and-docker-three-....
[1] www.bustle.com and www.romper.com do a combined XX million unique visitors per month.
Mostly it is not a big problem for us. Our functions that need to be fast have very high traffic. Our other functions don't need to be that fast. We also have a CDN in front of most things.
That said, we use Node. I have heard cold starts are worse for Python (I have no data) and near unbearable for Java (JVM startup and all that). We try to minimize external calls outside the main handler. Prior to this update we were packing configs in with webpack at deploy time; some people read them from S3 or Dynamo, which I don't consider a good solution. I'll be working this week on Shep 3.0, which will use the new env var features.
I have talked to other people who do various hacks to force their functions to be warm. I've looked at doing this and haven't found it necessary yet for our use case.
I started with a manually-packaged Python function and credstash [1] to store credentials for non-AWS services. It gets invoked nightly by CloudWatch and has been bulletproof for months. I recently built a microservice [2] for Election Day with Flask and deployed it behind API Gateway with Zappa [3]. Lambda isn't for every use case, but for those it is, I'm bullish.
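For anyone who hasn't seen the Zappa workflow: the app itself is just plain Flask, and `zappa init` / `zappa deploy` handle the Lambda and API Gateway wiring. A minimal sketch (the route and payload here are made up, not from my actual service):

    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/status")
    def status():
        return jsonify(ok=True)

    if __name__ == "__main__":
        app.run()  # local development; Zappa takes care of the Lambda side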
You can check out my talk at AWS E-Business Day. For ~50 minutes we talk about how we switched to serverless architectures for one of the largest online retailers in Europe, and what we learned.
I use it for one of my clients to send an average of 400k browser push notifications; competitors send them in 40 minutes, and with Lambda + Kinesis Firehose (16 shards) my solution is able to send them in less than 4 minutes.
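The fan-out side of that is straightforward with boto3; a sketch along these lines, assuming a 16-shard Kinesis stream (shards are a Kinesis Streams concept, and the stream name and payload shape are placeholders):

    import json
    import boto3

    kinesis = boto3.client("kinesis")

    def fan_out(notifications):
        records = [
            {"Data": json.dumps(n), "PartitionKey": n["subscriber_id"]}
            for n in notifications
        ]
        # put_records accepts at most 500 records per call
        for i in range(0, len(records), 500):
            kinesis.put_records(StreamName="push-notifications",
                                Records=records[i:i + 500])

With 16 shards you get up to 16 Lambda instances draining batches in parallel, which is presumably where the ~10x speedup comes from.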
Same here; I've deployed two functions. One is a thumbnail converter for S3 uploads, used a couple of times per day and running for over a year. The other is a log-aggregation script that runs thousands of times per hour and has been going for a few months now. Both run flawlessly; I've never had to worry about them after deploying.
It has been good for us, although our use case has mainly been ETL pipelines, which seem like quite a good fit, especially when you connect it up to Kinesis Firehose/Kinesis or S3 notifications.
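For reference, the S3-notification hookup is just a handler fired per object-created event; something like this (the bucket names and the transform step are placeholders):

    import urllib.parse
    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            # object keys arrive URL-encoded in the event
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            transformed = body.upper()  # stand-in for the real ETL step
            s3.put_object(Bucket="etl-output", Key=key, Body=transformed)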
The main issue we've seen is that even at the "top" tier (1536 MB) the CPU performance doesn't seem very good. It's difficult to tell, though, as they abstract you away from what is actually going on under the hood.
Main advantages: easy to deploy, no infrastructure worries if your problem fits, event-based triggers mean you don't have to write any of that code, and it fits well in the AWS ecosystem.
I've seen people making serverless websites and things like that, but I'm not convinced Lambda is a good fit for such a thing (happy to be proven wrong).
Not the original poster, but IIRC CloudWatch improved a lot in recent months. Before that, the web interface was kinda hard to use with all the log streams; right now it's much easier to search for some event, for example (time brackets were added, and auto-loading of previous events, if I'm not mistaken).
It works remarkably well if you use a development toolchain (Serverless and Claudia are both good, though I prefer the latter).
My only gripe is that API Gateway (APIG) charges $3.50 per million requests, and there is no other way to invoke Lambda functions over HTTP. I had to scale back my microservice ambitions because of this.
Hmmmm, I know it seems ridiculous, but I wonder if it would be cheaper to just spool up an EC2 machine and async-invoke the Lambda call, e.g. via boto?
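Something like this is what I have in mind: a thin HTTP frontend on EC2 that fires the function asynchronously with boto3, skipping API Gateway entirely (function name and payload are placeholders):

    import json
    import boto3

    lam = boto3.client("lambda")

    def invoke_async(payload):
        # InvocationType="Event" returns immediately (HTTP 202) without
        # waiting for the function to finish.
        lam.invoke(
            FunctionName="my-fn",
            InvocationType="Event",
            Payload=json.dumps(payload).encode(),
        )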
It seems like APIG has a boatload of features, which is great... but they are mostly unused.
I use Lambda in production as a replacement for a cron server. So far, all it does is keep my cache warm. So far, no complaints. It was actually TJ Holowaychuk's blog posts on his use of Lambda that got me to give it a shot.