Fly’s Prometheus Metrics (fly.io)
164 points by elithrar on May 13, 2021 | hide | past | favorite | 79 comments


Alas, we’ve found the first thing I _don't_ like about fly. They’ve sadly bought into the TSDB model where everything is a counter, instead of the more modern model where everything is a histogram.

Google has long since abandoned the Borgmon data model in favor of histograms with Monarch. The closest non-Google implementation is probably Circonus. Sadly neither is available as open source software.

I can’t really blame fly for not individually building an open source modern metric db. But it’s sort of sad that the infra team I’m most impressed with has to use metric systems from 15 years ago when the rest of their stack is so cutting edge.


So, most shops (at least the ones that have existed for a few years) are just starting their move to Prometheus (or are in the middle of it), and we already have a newer, better implementation to migrate to? I love the tech world! (I'm half ironic and half serious with that last sentence.)


Well, yeah, because Prometheus was inspired by Google's Borgmon, and soon there will be another product inspired by Google's Monarch.

Google's SRE stack is lightyears ahead of anything else out there simply because they're at the scale where they can afford to hire dedicated developers to just write internal ops software where most other SREs are understaffed and overworked just on operational projects.



We do plenty of histograms! Vicky is good at them! Counters are just easier to write about.


Vicky is about the best-case scenario for a counter-based metrics DB. I am routinely impressed by the things it accomplishes, but Monarch and IronDB from Circonus use histograms as their base data structure, which means they avoid all the hacks that counter-based TSDBs have to deal with. The obvious one in the Prometheus stack is having to pre-declare your histogram bounds.
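To make the pre-declared-bounds complaint concrete, here's a minimal pure-Python sketch of a Prometheus-style fixed-bucket histogram (this is an illustration of the data model, not the actual client library): the bucket upper bounds must be chosen before any data arrives, and precision finer than those bounds is lost forever.

```python
import bisect

class FixedBucketHistogram:
    """Sketch of a Prometheus-style histogram with pre-declared bounds."""

    def __init__(self, upper_bounds):
        self.bounds = sorted(upper_bounds)          # e.g. latency buckets, in seconds
        self.counts = [0] * (len(self.bounds) + 1)  # last slot is the implicit +Inf bucket
        self.total = 0.0

    def observe(self, value):
        # bisect_left gives the first bucket whose upper bound is >= value,
        # matching Prometheus's "le" (less-or-equal) bucket semantics.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value

h = FixedBucketHistogram([0.01, 0.05, 0.1, 0.5, 1.0])
for latency in [0.003, 0.07, 0.2, 2.5]:
    h.observe(latency)

# The 2.5s observation lands in the +Inf bucket: once the bounds are
# declared, anything beyond or between them can't be recovered.
print(h.counts)  # [1, 0, 1, 1, 0, 1]
```

Histogram-native TSDBs avoid this by storing the distribution itself rather than counts against fixed cut points.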


I'd also take Monarch's query language over PromQL any day.


Note there is work under way in Prometheus to make histograms more flexible:

https://static.sched.com/hosted_files/promcononline2021/33/2...


What would you recommend as a self-hosted solution today?


Circonus’s IronDB is the only histogram DB available outside of Google that I know of.

They have a self hosted option but it’s not free or open source.


First time I've heard of Fly, went to check out their docs and found this awesome little mention:

> We use Let's Encrypt to issue certificates, and donate half of our SSL fees to them at the end of each calendar year.


We wrote a similarly tedious piece about how our certificate system works, if you're interested: https://fly.io/blog/how-cdns-generate-certificates/


Just for a bit of social recognition I love bumping into one of you folks' similarly tedious pieces!

Any chance of an RSS feed on that blog? [e: ah it's just missing the indicator meta tag thingy, https://fly.io/feed.xml]


Does anyone here have experience using Fly? I’ve seen a few of their posts and it seems quite nice.


I recently started using Fly to host the anycast DNS name servers for my DNS hosting service SlickDNS (https://www.slickdns.com/).

There were some initial teething issues, as Fly only recently added UDP support, but they were very responsive to the various bugs I reported and fixed them. My name servers have been running problem-free for several weeks now.

The Fly UX via the flyctl command-line app is excellent, very Heroku-like.

For apps that need anycast, the only real alternative to Fly that I found is AWS Global Accelerator, but it limits you to two anycast IP addresses, it's much more expensive than Fly, and you're fighting the AWS API the whole time.


You can get a reasonable idea of the problems people run into on our community forum. I use Fly for everything but that may not be the best signal for you: https://community.fly.io/


I recently set up my own http://logpaste.com to give them a try. Fly was quite easy to use.


Love Fly. Been using it for about a year now? We host Draftbit in a bunch of regions. The community is growing and they know their shit.


I use fly to host a few apps and it's fantastic. Extremely responsive, and having it handle TLS offload is great from a performance perspective.

I find their billing dashboard leaves a bit to be desired, I'm still very nervous I'm going to get a big unexpected bill.


If you get a big surprise bill, email me and we'll fix it. We're not interested in surprising people with bills.


I discovered fly a month or two ago, and I've been incredibly impressed. The dev experience is really nice, and the team is super responsive.


It seems fairly expensive for what you get?


For a multi-region setup it's very competitively priced, IMO.

I have some pretty important services I run on there and my availability has beaten anything I’ve run on GCP.


For which resources?

I'm not a heavy cloud services user, so the prices aren't that important to me, but fly.io seems to be on par with other providers.

I think of Heroku as fly's most direct competitor. Heroku charges $50/mo for a dedicated VM with 1 GB RAM, whereas fly charges $31/mo for dedicated VM with 2 GB RAM.


> VictoriaMetrics

:(

Not actually Prometheus-compatible, sloppy code, spotty docs. I have no idea why this dumb product continues to attract users.

https://prometheus.io/blog/2021/05/04/prometheus-conformance...

> Telegraf, which is to metrics sort of what Logstash is to logs: a swiss-army knife tool that adapts arbitrary inputs to arbitrary output formats. We run Telegraf agents on our nodes to scrape local Prometheus sources, and Vicky scrapes Telegraf. Telegraf simplifies the networking for our metrics; it means Vicky (and our iptables rules) only need to know about one Prometheus endpoint per node.

Normally you just use a regular Prometheus server to do this. Why add another, different technology to the stack?

> We spent some time scaling it with Thanos, and Thanos was a lot, as far as ops hassle goes.

It really isn't -- assuming you're not trying to bend Prometheus into something it isn't. Prometheus works using a federated, pull-based architecture. It expects to be near the things it's monitoring, and expects you to build out a hierarchy of infrastructure, in layers, to handle larger scopes.

This is structurally different to what I'll call the "clustering" model of scale, where you have all your data sources pushing their data, aggregating maybe on the machine or datacenter level, but then shuttling everything to a single central place, which you scale vertically from the perspective of your users. This appears to be what you want to do, based on the prevalence of push-based tech in your stack.

Prometheus doesn't work this way. Some people really want it to work this way, and have even created entire product lines that make it look as if it works this way (Cortex, M3db) but it's fundamentally just not how it's designed to be used. If you try to make it work this way yourself, you'll certainly get frustrated.


> Normally you just use a regular Prometheus server to do this. Why add another, different technology to the stack?

Our physical hosts have hundreds of services exporting metrics, and many of those metrics come from untrusted sources. Aggregating them in one place lets us both rewrite labels and reduce the scrape-endpoint discoverability problem.
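The label-rewriting step described above might look something like this hypothetical sketch: metrics scraped from untrusted workloads get their labels filtered to an allowlist and stamped with the tenant that produced them, before being re-exposed from a single aggregation endpoint. The label names and allowlist here are illustrative, not Fly's actual configuration.

```python
# Labels an untrusted workload is allowed to set (hypothetical allowlist).
ALLOWED_LABELS = {"method", "status", "route"}

def rewrite(sample, tenant_id):
    """Filter untrusted labels and stamp a trusted tenant label."""
    name, labels, value = sample
    clean = {k: v for k, v in labels.items() if k in ALLOWED_LABELS}
    clean["app_id"] = tenant_id  # set by the aggregator, not the workload
    return (name, clean, value)

# An untrusted app tries to leak an internal hostname via a label:
raw = ("http_requests_total",
       {"method": "GET", "status": "200", "secret_host": "db1.internal"},
       42.0)
print(rewrite(raw, "app-1234"))
# → ('http_requests_total', {'method': 'GET', 'status': '200', 'app_id': 'app-1234'}, 42.0)
```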

> Not actually Prometheus-compatible, sloppy code, spotty docs. I have no idea why this dumb product continues to attract users.

Because it works incredibly well, it's easy to operate, and handles multi tenancy for us.


> Our physical hosts have hundreds of services exporting metrics, and many of those metrics come from untrusted sources. Aggregating them in one place lets us both rewrite labels and reduce the scrape-endpoint discoverability problem.

OK, but Prometheus can do all of this just fine?

> Because it works incredibly well, it's easy to operate, and handles multi tenancy for us.

Again, Prometheus itself ticks all of these boxes, too, if you're not trying to force it to be something it's not.


We're not forcing Prometheus to be anything, since we're not using it. What Prometheus wants to be is not really a relevant constraint in our design space. A topologically simple, scalable, multi-tenant cluster that presents as just a giant bucket of metrics to our users is what we wanted, and we got it.

There's an interesting discussion to be had about how our infrastructure works; for example, in the abstract, I'd prefer a "pure" pull-based design too. But things appear and disappear on our network a lot, and remote write simplifies a lot of configuration for us, so I don't think it's going anywhere.

I think you're reading a critique of Prometheus that isn't really present in what we're writing. Prometheus is great! Everyone should use it! Our needs are weird, since we're handling metrics as a feature of a PAAS that we're building.


> I think you're reading a critique of Prometheus that isn't really present in what we're writing.

I'm observing that you've used pull-based, horizontally-scaled tools to build a push-based, vertically-scaled telemetry infrastructure. It can be made to work, sure, but the solution is an impedance mismatch to the problem.


I agree with you here. Using Prometheus, federated Prometheus, and Thanos on top of it for good measure, would probably get you better results without using a hodge-podge of non-Prometheus-compatible tools.


So, just so you understand where our heads are at: we want our users to light their apps up with lots of Prometheus metrics. Then we want them to be able to pop up a free-mode Grafana Cloud site, aim it at our API, add a dashboard, start typing a metric name and have it autocomplete to all possible metrics.

That pretty much works now?

I see the ideological purity case you two are making for "true Prometheus", but it is not at all clear to me how doing a purer version of Prometheus would make any of our users happier.

What am I missing?


> That pretty much works now?

Well, with the requisite glue code that would inform each user's Prometheus instance how to scrape the service instances -- yes, more or less.

> is not at all clear to me how doing a purer version of Prometheus would make any of our users happier.

If the only things you care about when you build systems are "works" and "direct impact on customers" then there's not really a point to this conversation. The things I'm speaking about, the architectural soundness of a distributed system, are largely orthogonal to those metrics, at least to the first derivative.


Oh, sorry, I misunderstood your meaning when you wrote "That pretty much works now?" — I thought it was a question as to whether a more traditional Prom architecture could do it, but I see now you're just saying you already have this set up.


Right. But also: I'm not trying to be dismissive. We both know that we're looking at this through different lenses. I'm genuinely curious how your lens could inform mine; like, is there something I'm missing? Where, by deploying a much more conventional Prometheus architecture, I could somehow make our users happier? I don't see it, but I'm a dummy; if there's something for me to learn, I'm happy to learn it.


I'm pretty confident that the end result would be simpler in an architectural sense (i.e. fewer components), it would be easier to understand and maintain, and it would behave both more predictably and more reliably.

But these are subjective claims! Not everyone thinks the same way!


How big is the fly eng team at this point? You all seem to be doing a ton, I’m always kind of surprised these posts don’t end with the usual “we’re hiring” blurb that’s become the norm on these sorts of tech posts.


We're small, only 7 people. That's on purpose (so far). The magic 8 ball says we'll grow in the next few months though.


You guys really punch above your weight. Like Bruce Lee. Here's hoping you grow slow and stay that way!


Do you plan to join the Bandwidth Alliance at Cloudflare? Data transfer is quite costly; it's priced the same as GCP/AWS.


Hadn't heard of promxy before. In the past, to reduce cardinality and deduplicate metrics, I've just run another instance of Prometheus entirely in-memory and used rewrite rules.

Exposing a metrics endpoint for customers is nice. How do you manage the cardinality? I haven't used Victoria before, is it just better at high cardinality time series?


That’s its promise, e.g. https://valyala.medium.com/high-cardinality-tsdb-benchmarks-...

FWIW, we are in the process of considering it as a single monitoring plane for multiple clusters with high cardinality metrics


Vicky isolates tenants into their own "shards" so we mostly don't have to worry about it. It's also pretty good at high cardinality.
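For context on what "high cardinality" means here, a quick sketch: each distinct combination of label values is its own time series, so the series count for one metric is the product of the per-label value counts. The label sets below are made up for the example.

```python
from itertools import product

# Hypothetical label sets for a single metric name.
label_values = {
    "app":    [f"app-{i}" for i in range(100)],  # 100 tenant apps
    "region": ["ams", "ord", "syd", "lax"],      # 4 regions
    "status": ["200", "404", "500"],             # 3 status codes
}

# One time series per unique label combination:
series = list(product(*label_values.values()))
print(len(series))  # 100 * 4 * 3 = 1200 series for one metric name
```

Add one more label with even modest variety (say, a route with 50 values) and you're at 60,000 series, which is why per-tenant sharding and a cardinality-tolerant storage engine matter.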


They mentioned that they decided against Thanos for the storage of metrics, but I'd be curious to hear if other TSDBs were considered. It's a hot space; I know about M3DB, ClickHouse, Timescale, InfluxDB, QuestDB, OpenTSDB, etc.


We kind of naturally got from Prometheus -> VictoriaMetrics. Vicky is very simple to operate, which was a win. I'm a huge fan of Timescale, though, and we have big Postgres plans that I hope include Timescale. :)


> When it comes to automated monitoring, there are two big philosophies, “checks” and “metrics”.

There's a third, "events". Just push an event out whenever something interesting happens, and let the monitoring tool decide whether to count, aggregate, histogram, alert, etc.

Events require less code in the app (no storage, no aggregation, no web server), and allow more flexibility in processing. I have used events to great effect. I am baffled as to why monitoring people still only talk about metrics.
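A minimal sketch of the events philosophy, under the assumption of a simple in-process pipeline (the event shapes are illustrative): the app emits one raw event per interesting occurrence, and counts, distributions, and alerts are all derived later by the monitoring side rather than baked into the app.

```python
from collections import Counter

# The app's only job: emit a raw event when something interesting happens.
events = [
    {"type": "request", "route": "/login", "duration_ms": 12},
    {"type": "request", "route": "/login", "duration_ms": 48},
    {"type": "request", "route": "/home",  "duration_ms": 7},
]

# The monitoring side decides, after the fact, how to interpret them.
# A counter is just a count over events...
per_route = Counter(e["route"] for e in events)

# ...and a latency distribution is just an aggregation over the same events,
# with no buckets or summaries pre-declared in the application.
durations = sorted(e["duration_ms"] for e in events)

print(per_route)   # Counter({'/login': 2, '/home': 1})
print(durations)   # [7, 12, 48]
```

The trade-off, of course, is event volume: shipping every occurrence costs far more bandwidth and storage than shipping pre-aggregated counters.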


> If you’re an Advent of Code kind of person and you haven’t already written a parser for this format, you’re probably starting to feel a twitch in your left eyelid. Go ahead and write the parser; it’ll take you 15 minutes and the world can’t have too many implementations of this exposition format. There's a lesson about virality among programmers buried in here somewhere.

Huh? Who gets excited about writing a parser?

What was wrong with "${key} ${value}" on separate lines?


Is "excited" the word? I don't know. I think "obsessively compelled" is more what I was going for. It's one of those formats where --- despite apparently being originally intended for human consumption --- you can immediately see the rule of construction for it, like you're not just reading the data, but also the pseudocode for how it's formatted.


Having written a lot of parsers, I really loved how you put a newline in a "keyword"-type thing, just to keep it interesting :-)


I love writing one liner parsers in Elixir when I’m bored. Pipes and pattern matching for the win!!


Well I suppose what's wrong with it is that these metrics are more complicated than key=>value. The metric referred to in your quote contains a metric name, multiple attributes (status, app ID etc) and a value.
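In the spirit of the article's "go ahead and write the parser" aside, here's roughly what the 15-minute version looks like: a minimal parser for `name{label="value",...} value` sample lines that skips `# HELP` / `# TYPE` comments. It deliberately ignores escape sequences, timestamps, and multiline edge cases that a conformant parser would need.

```python
import re

# metric_name, optional {label="value",...} block, then the sample value.
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
                     r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')

def parse(text):
    """Parse Prometheus exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        m = LINE_RE.match(line)
        if m:
            labels = dict(LABEL_RE.findall(m.group("labels") or ""))
            samples.append((m.group("name"), labels, float(m.group("value"))))
    return samples

sample = '''# TYPE fly_app_http_responses_count counter
fly_app_http_responses_count{app="my-app",status="200"} 7810
'''
print(parse(sample))
# → [('fly_app_http_responses_count', {'app': 'my-app', 'status': '200'}, 7810.0)]
```

Which illustrates the point above: it's a metric name, a bag of labels, and a value, not just `${key} ${value}`.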


Who's the target audience for fly?

I’m trying to understand the market they’re operating in. Big ol enterprises would probably want to run on AWS/GCP right? So would startups? What’s the long game? Genuine question.


It all starts with a developer at BigCo who is frustrated with the complexity of LegacyCo’s stack... so she builds it on NewCo’s coolness. And then suddenly BigCo wants enterprise features from NewCo, and so it goes...


I really like those diagrams. What tools/software is used to generate?


excalidraw.com

It's pretty great, we're happy plus subscribers.


I'm not quite sure if fly is production-ready for our use case yet, but it does look awesome.


It depends entirely on your stack + use case! If you feel like emailing I can give you an honest take. :)


Thanos is a lot of ops, true. But did they try Cortex? Oh, and M3?


We looked at Cortex but running a Cassandra cluster was a little too much for us. I don't think we saw M3 until after we'd started using VictoriaMetrics.


Cortex doesn’t need a NoSQL database now, since about a year ago. It does need something S3-like.

(I am a Cortex maintainer)


Oh that's a great improvement.


Not to pick on Fly (seems nice), but on the trend for containers:

>if you’ve got a Docker container, it can be running on Fly in single-digit minutes.

I used to laugh at the old Plan 9 fortune, "... Forking an allegro process requires only seconds... -V. Kelly". Guess I'm not laughing anymore?

FWIW, the performance of components is the barrier to composition in system design and development. You can't compose modules that take seconds to act and still have something that is usable in real time.


Fly runs the contents of the container in a VM. Bring your own runtime in that VM, use kernel sandboxing features, do whatever you want (in linux).

I’m all for the “k8s is not erlang” position, but in fly’s case it seems like the right tool for the job. Fly is much faster than a new EC2 instance.


More info about this in the blogpost Docker without Docker[0]

[0] https://fly.io/blog/docker-without-docker/


You’re remarking on a development time number. Like, the amount of time it’ll take to create your account.


I can see how someone would misunderstand that claim though: the idea of getting up and running on a new hosting provider that quickly seems so unlikely that thinking "well they must be talking about container launch times here instead" is an understandable mistake.


Container launch times are frequently in minutes if we include pulling the images from a repo, which we reasonably should.

In whichever case, that doesn't mean we can't compose them into a performant product as the OP suggests, it just means you run them as daemons so you can amortize the startup cost across many invocations. This isn't specific to containers--we do the same thing for web servers, databases, virtual machines, physical machines etc. Anything with a startup cost that you don't want to pay each time.


Why would you include pulling the container image as part of the container launch time? It’s a one-time event after a code push.


It's not a one time event, each host that runs a container image needs to pull it at least once. Minimizing pulls is a good optimization but you have to work pretty hard to really cut them down.


fwiw, I moved my prod env from a VPS (at linode) to fly.io twice in one week. Once to try it out and get a feel for what I was getting into and the second time for real. Took about 90 minutes each (plus some thinking time and doc reading). It's a small app without a ton of data to move nor a lot of traffic at this point, so take my experience with a grain of salt.


Oh, you’re right; I’m just clarifying. The perils of trying to fit your whole project into a tiny lead paragraph. :)


"Create your first project in minutes. Create your next project in under 25 seconds."


You missed the mark, mate. The "single-digit minutes" applies to the one-time onboarding process, not cold-start times for a deployed service!


> Fly.io transforms container images into fleets of micro-VMs running around the world on our hardware.

Oh boy!

> None of us have ever worked for Google, let alone as SREs. So we’re going out on a limb

Oh.... boy.

> We spent some time scaling it with Thanos, and Thanos was a lot, as far as ops hassle goes.

You know, they have these companies now, that will collect your metrics for you, so that you don't have to deal with ops hassle.

> In each Firecracker instance, we run our custom init,

... in Rust. Yes, the thing that is normally a shell script, is now a compiled program in a new language, that mostly just runs mkdir(), mount() and ethtool(). (https://github.com/superfly/init-snapshot/blob/public/src/bi...). In a few years, when that component is passed off to a dedicated Ops team, and they find it hard to hire a sysadmin who also knows Rust, there will be some poor intern who learned Rust over the summer whose job is to rewrite that thing back into a shell script.


Now I like Fly even more. I used to be an SRE at AMZN, and I'm 100% on board with replacing shell scripts and init with typed configuration files (like Dhall, for example) and a system written in Rust for parsing and understanding those files. I think this is the best part of systemd. There are other projects, like s6, for example. I need to understand more about Fly's implementation of init; I am very curious.


It sounds like you're saying that it would be better if our init were literally a bash script. Is that what you're saying?


Not literally, no, as you need an init program that performs certain steps first which bash can't do. But once those few operations are done you can pass execution off to a script.

It looks like your init is running the gamut of typical linux init steps: handling signals, spawning processes, handling TTYs, running typical system start-up commands. Then it gathers host and network metrics, reports results, and operates as some kind of websocket server? All of that other stuff should live in a separate dedicated program that the init script runs. Unless I missed something and that's not possible?


We can't assume any particular environment inside a VM; we don't own the "distribution", or really anything other than init itself, including libraries. We could make something work, but it'd be more effort than what we do now, which is just some trivial straight-line Rust code, in an init that was going to be Rust anyways.

We wrote a little more about this here:

https://fly.io/blog/docker-without-docker/


Yeah, so you'd use an initrd as "your" environment, run your programs, pivot_root (or the modern equivalent) and then pass execution to your guest's environment, which can have its own init.

I say it will eventually be rewritten as a shell script (minus the stats, which will be replaced by a monitoring agent) because people messing with embedded Linux often try writing their own compiled init to bundle with an initrd. It eventually becomes a hassle, so they either make their own feature-filled init replacement, or they go back to a shell script. Most go for the shell script.

I mean, it's a DSL for composing programs. It's too powerful for what you need to do, but all its other advantages (portability, flexibility, simplicity, tracing, environment passing, universal language, rapid development, cheaper support, yadda yadda) make it a win over time. The only reason I can see not to use a shell script is if speed is your highest priority.


I like shell scripts too. I once wrote a shell script that converted X.509 certificates into shell scripts that generated the same X.509 certificate, field by field. Shell scripts are great. Not the right tool here, but, great.



