Alas, we’ve found the first thing I _don’t_ like about Fly. They’ve sadly bought into the TSDB model where everything is a counter, instead of the more modern model where everything is a histogram.
Google has long since abandoned the Borgmon data model for histograms with Monarch. The closest non-Google implementation is probably Circonus. Sadly, neither is available as open source software.
I can’t really blame Fly for not single-handedly building a modern open source metrics db. But it’s sort of sad that the infra team I’m most impressed with has to use metrics systems from 15 years ago when the rest of their stack is so cutting edge.
So, most shops (at least the ones that have been around for a few years) are just starting their move to Prometheus (or are in the middle of it), and we already have a newer, better implementation to migrate to? I love the tech world! (I'm half ironic and half serious with that last sentence.)
Well, yeah, because Prometheus was inspired by Google's Borgmon, and soon there will be another product inspired by Google's Monarch.
Google's SRE stack is light-years ahead of anything else out there, simply because they're at a scale where they can afford to hire dedicated developers just to write internal ops software, while most other SREs are understaffed and overworked on purely operational projects.
Vicky is about the best-case scenario for a counter-based metrics db. I am routinely impressed by the things it accomplishes, but Monarch and IRONdb from Circonus use histograms as their base data structure, which means they avoid all the hacks that counter-based TSDBs have to deal with. The obvious one in the Prometheus stack is having to pre-declare your histogram bucket bounds.
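To make that concrete, here is a minimal sketch of what pre-declared bounds look like with the community `prometheus` crate for Rust; the metric name and bucket edges are made up for illustration, and this is not Fly's code:

```rust
// Minimal sketch, not Fly's code: pre-declared buckets with the Rust
// `prometheus` crate. Names and bucket edges are illustrative.
use prometheus::{Histogram, HistogramOpts};

fn main() {
    // The bucket bounds (in seconds) must be chosen up front, at registration
    // time; the stored distribution can never be re-binned afterwards.
    let opts = HistogramOpts::new("http_request_duration_seconds", "request latency")
        .buckets(vec![0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0]);
    let latency = Histogram::with_opts(opts).unwrap();
    prometheus::default_registry()
        .register(Box::new(latency.clone()))
        .unwrap();

    // Only the per-bucket counters are kept, not the raw observation.
    latency.observe(0.023);
}
```

Guess the edges badly and there is no going back; a histogram-native store sidesteps that choice entirely.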
I recently started using Fly to host the anycast DNS name servers for my DNS hosting service SlickDNS (https://www.slickdns.com/).
There were some initial teething issues as Fly only recently added UDP support, but they were very responsive and fixed the various bugs I reported. My name servers have been running problem-free for several weeks now.
The Fly UX via the flyctl command-line app is excellent, very Heroku-like.
For apps that need anycast, the only real alternative to Fly that I found is AWS Global Accelerator, but it limits you to two anycast IP addresses, it's much more expensive than Fly, and you're fighting the AWS API the whole time.
You can get a reasonable idea of the problems people run into on our community forum. I use Fly for everything but that may not be the best signal for you: https://community.fly.io/
I'm not a heavy cloud services user, so the prices aren't that important to me, but fly.io seems to be on par with other providers.
I think of Heroku as Fly's most direct competitor. Heroku charges $50/mo for a dedicated VM with 1 GB RAM, whereas Fly charges $31/mo for a dedicated VM with 2 GB RAM.
> Telegraf, which is to metrics sort of what Logstash is to logs: a swiss-army knife tool that adapts arbitrary inputs to arbitrary output formats. We run Telegraf agents on our nodes to scrape local Prometheus sources, and Vicky scrapes Telegraf. Telegraf simplifies the networking for our metrics; it means Vicky (and our iptables rules) only need to know about one Prometheus endpoint per node.
Normally you just use a regular Prometheus server to do this. Why add another, different technology to the stack?
> We spent some time scaling it with Thanos, and Thanos was a lot, as far as ops hassle goes.
It really isn't -- assuming you're not trying to bend Prometheus into something it isn't. Prometheus works using a federated, pull-based architecture. It expects to be near the things it's monitoring, and expects you to build out a hierarchy of infrastructure, in layers, to handle larger scopes.
This is structurally different to what I'll call the "clustering" model of scale, where you have all your data sources pushing their data, aggregating maybe on the machine or datacenter level, but then shuttling everything to a single central place, which you scale vertically from the perspective of your users. This appears to be what you want to do, based on the prevalence of push-based tech in your stack.
Prometheus doesn't work this way. Some people really want it to work this way, and have even created entire product lines that make it look as if it works this way (Cortex, M3DB), but it's fundamentally just not how it's designed to be used. If you try to make it work this way yourself, you'll certainly get frustrated.
> Normally you just use a regular Prometheus server to do this. Why add another, different technology to the stack?
Our physical hosts have hundreds of services exporting metrics. And many of those exported metrics are from untrusted sources. So we can both rewrite labels and decrease the scrape endpoint discoverability problem by aggregating them in one place.
> Not actually Prometheus-compatible, sloppy code, spotty docs. I have no idea why this dumb product continues to attract users.
Because it works incredibly well, it's easy to operate, and handles multi tenancy for us.
> Our physical hosts have hundreds of services exporting metrics. And many of those exported metrics are from untrusted sources. So we can both rewrite labels and decrease the scrape endpoint discoverability problem by aggregating them in one place.
OK, but Prometheus can do all of this just fine?
> Because it works incredibly well, it's easy to operate, and handles multi tenancy for us.
Again, Prometheus itself ticks all of these boxes, too, if you're not trying to force it to be something it's not.
We're not forcing Prometheus to be anything, since we're not using it. What Prometheus wants to be is not really a relevant constraint in our design space. A topologically simple, scalable, multi-tenant cluster that presents as just a giant bucket of metrics to our users is what we wanted, and we got it.
There's an interesting discussion to be had about how our infrastructure works; for example, in the abstract, I'd prefer a "pure" pull-based design too. But things appear and disappear on our network a lot, and remote write simplifies a lot of configuration for us, so I don't think it's going anywhere.
I think you're reading a critique of Prometheus that isn't really present in what we're writing. Prometheus is great! Everyone should use it! Our needs are weird, since we're handling metrics as a feature of a PaaS that we're building.
> I think you're reading a critique of Prometheus that isn't really present in what we're writing.
I'm observing that you've used pull-based, horizontally-scaled tools to build a push-based, vertically-scaled telemetry infrastructure. It can be made to work, sure, but the solution is an impedance mismatch to the problem.
I agree with you here. Using Prometheus, federated Prometheus, and Thanos on top of it for good measure, would probably get you better results without using a hodge-podge of non-Prometheus-compatible tools.
So, just so you understand where our heads are at: we want our users to light their apps up with lots of Prometheus metrics. Then we want them to be able to pop up a free-mode Grafana Cloud site, aim it at our API, add a dashboard, start typing a metric name and have it autocomplete to all possible metrics.
That pretty much works now?
I see the ideological purity case you two are making for "true Prometheus", but it is not at all clear to me how doing a purer version of Prometheus would make any of our users happier.
Well, with the requisite glue code that would inform each user's Prometheus instance how to scrape the service instances -- yes, more or less.
> is not at all clear to me how doing a purer version of Prometheus would make any of our users happier.
If the only things you care about when you build systems are "works" and "direct impact on customers" then there's not really a point to this conversation. The things I'm speaking about, the architectural soundness of a distributed system, are largely orthogonal to those metrics, at least to the first derivative.
Oh, sorry, I misunderstood your meaning when you wrote "That pretty much works now?" — I thought it was a question as to whether a more traditional Prom architecture could do it, but I see now you're just saying you already have this set up.
Right. But also: I'm not trying to be dismissive. We both know that we're looking at this through different lenses. I'm genuinely curious how your lens could inform mine; like, is there something I'm missing? Where, by deploying a much more conventional Prometheus architecture, I could somehow make our users happier? I don't see it, but I'm a dummy; if there's something for me to learn, I'm happy to learn it.
I'm pretty confident that the end result would be simpler in an architectural sense (i.e. fewer components), it would be easier to understand and maintain, and it would behave both more predictably and more reliably.
But these are subjective claims! Not everyone thinks the same way!
How big is the Fly eng team at this point? You all seem to be doing a ton; I’m always kind of surprised these posts don’t end with the usual “we’re hiring” blurb that’s become the norm on these sorts of tech posts.
Hadn't heard of promxy before. In the past, to reduce cardinality/deduplicate metrics, I've just run another instance of Prometheus entirely in-memory and used rewrite rules.
Exposing a metrics endpoint for customers is nice. How do you manage the cardinality? I haven't used Victoria before; is it just better at high-cardinality time series?
They mentioned that they decided against Thanos for the storage of metrics, but I would be curious to hear if other TSDBs were considered. It is a hot space; I know about M3DB, ClickHouse, Timescale, InfluxDB, QuestDB, OpenTSDB, etc.
We kind of naturally went from Prometheus -> VictoriaMetrics. Vicky is very simple to operate, which was a win. I'm a huge fan of Timescale, though, and we have big Postgres plans that I hope include Timescale. :)
> When it comes to automated monitoring, there are two big philosophies, “checks” and “metrics”.
There's a third, "events". Just push an event out whenever something interesting happens, and let the monitoring tool decide whether to count, aggregate, histogram, alert, etc.
Events require less code in the app (no storage, no aggregation, no web server), and allow more flexibility in processing. I have used events to great effect. I am baffled as to why monitoring people still only talk about metrics.
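To make that concrete, here is a minimal sketch of the event approach, assuming the `serde_json` crate; the event shape and field names are made up, not from any particular monitoring system:

```rust
// Minimal sketch of "just emit events": no in-process counters, buckets, or
// /metrics endpoint; downstream tooling decides how to count, aggregate,
// histogram, or alert. Field names here are illustrative.
use serde_json::json;

fn record_request(path: &str, status: u16, duration_ms: f64) {
    let event = json!({
        "event": "http_request",
        "path": path,
        "status": status,
        "duration_ms": duration_ms,
    });
    // Ship it however you like: stdout for a log shipper, UDP, a queue, etc.
    println!("{}", event);
}

fn main() {
    record_request("/checkout", 200, 37.2);
}
```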
> If you’re an Advent of Code kind of person and you haven’t already written a parser for this format, you’re probably starting to feel a twitch in your left eyelid. Go ahead and write the parser; it’ll take you 15 minutes and the world can’t have too many implementations of this exposition format. There's a lesson about virality among programmers buried in here somewhere.
Huh? Who gets excited about writing a parser?
What was wrong with "${key} ${value}" on separate lines?
Is "excited" the word? I don't know. I think "obsessively compelled" is more what I was going for. It's one of those formats where --- despite apparently being originally intended for human consumption --- you can immediately see the rule of construction for, like you're not just reading the data, but also the pseudocode for how it's formatted.
Well, I suppose what's wrong with it is that these metrics are more complicated than key => value. The metric referred to in your quote contains a metric name, multiple attributes (status, app ID, etc.), and a value.
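For the curious, here is a minimal sketch of a parser for that shape of line (name, optional labels, value) in Rust; it ignores comments, label-value escaping, and optional timestamps, and the sample line is invented:

```rust
// Minimal sketch of an exposition-format parser for lines like
//   http_responses_total{status="200",app="demo"} 1027
// Comments, escaped label values, and timestamps are not handled.
#[derive(Debug)]
struct Sample {
    name: String,
    labels: Vec<(String, String)>,
    value: f64,
}

fn parse_line(line: &str) -> Option<Sample> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None; // skip blank lines and comments
    }
    // The value is the last space-separated token; everything before it is
    // the metric name plus an optional {label="value",...} block.
    let (metric, value) = line.rsplit_once(' ')?;
    let value: f64 = value.trim().parse().ok()?;

    let (name, label_body) = match metric.split_once('{') {
        Some((name, rest)) => (name, rest.strip_suffix('}')?),
        None => (metric, ""),
    };

    let labels: Vec<(String, String)> = label_body
        .split(',')
        .filter(|kv| !kv.is_empty())
        .filter_map(|kv| {
            let (k, v) = kv.split_once('=')?;
            Some((k.to_string(), v.trim_matches('"').to_string()))
        })
        .collect();

    Some(Sample { name: name.to_string(), labels, value })
}

fn main() {
    let line = r#"http_responses_total{status="200",app="demo"} 1027"#;
    println!("{:?}", parse_line(line));
}
```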
I’m trying to understand the market they’re operating in. Big ol’ enterprises would probably want to run on AWS/GCP, right? So would startups? What’s the long game? Genuine question.
It all starts with a developer at BigCo who is frustrated with the complexity of LegacyCo’s stack... so she builds it on NewCo’s coolness. And then suddenly BigCo wants enterprise features from NewCo, and so it goes...
We looked at Cortex but running a Cassandra cluster was a little too much for us. I don't think we saw M3 until after we'd started using VictoriaMetrics.
Not to pick on Fly (seems nice), but on the trend for containers:
>if you’ve got a Docker container, it can be running on Fly in single-digit minutes.
I used to laugh at the old Plan 9 fortune, "... Forking an allegro process requires only seconds... -V. Kelly". Guess I'm not laughing anymore?
FWIW, performance of components is the barrier to composition in system design and development. You can't compose modules that take seconds to act and still have something that is usable in real time.
I can see how someone would misunderstand that claim though: the idea of getting up and running on a new hosting provider that quickly seems so unlikely that thinking "well they must be talking about container launch times here instead" is an understandable mistake.
Container launch times are frequently in minutes if we include pulling the images from a repo, which we reasonably should.
In either case, that doesn't mean, as the OP suggests, that we can't compose them into a performant product; it just means you run them as daemons so you can amortize the startup cost across many invocations. This isn't specific to containers; we do the same thing for web servers, databases, virtual machines, physical machines, etc. Anything with a startup cost that you don't want to pay each time.
Why would you include pulling the container image as something that should be part of the container launch time? It’s a one-time event after a code push.
It's not a one time event, each host that runs a container image needs to pull it at least once. Minimizing pulls is a good optimization but you have to work pretty hard to really cut them down.
fwiw, I moved my prod env from a VPS (at linode) to fly.io twice in one week. Once to try it out and get a feel for what I was getting into and the second time for real. Took about 90 minutes each (plus some thinking time and doc reading). It's a small app without a ton of data to move nor a lot of traffic at this point, so take my experience with a grain of salt.
> Fly.io transforms container images into fleets of micro-VMs running around the world on our hardware.
Oh boy!
> None of us have ever worked for Google, let alone as SREs. So we’re going out on a limb
Oh.... boy.
> We spent some time scaling it with Thanos, and Thanos was a lot, as far as ops hassle goes.
You know, they have these companies now that will collect your metrics for you, so that you don't have to deal with ops hassle.
> In each Firecracker instance, we run our custom init,
... in Rust. Yes, the thing that is normally a shell script is now a compiled program in a new language that mostly just runs mkdir(), mount(), and ethtool() (https://github.com/superfly/init-snapshot/blob/public/src/bi...). In a few years, when that component is passed off to a dedicated Ops team and they find it hard to hire a sysadmin who also knows Rust, there will be some poor intern who learned Rust over the summer whose job is to rewrite that thing back into a shell script.
Now I like Fly even more. I used to be an SRE for AMZN, and I'm 100% on board with replacing shell scripts and init with typed configuration files (like Dhall, for example) and a system written in Rust for parsing and understanding those files. I think this is the best part of systemd. There are other projects like s6, for example. I need to understand more about Fly's implementation of init; I am very curious.
Not literally, no, as you need an init program that performs certain steps first which bash can't do. But once those few operations are done you can pass execution off to a script.
It looks like your init is running the gamut of typical Linux init steps: handling signals, spawning processes, handling TTYs, running typical system start-up commands. Then it gathers host and network metrics, reports results, and operates as some kind of WebSocket server? All of that other stuff should live in a separate dedicated program that the init script runs. Unless I missed something and that's not possible?
We can't assume any particular environment inside a VM; we don't own the "distribution", or really anything other than init itself, including libraries. We could make something work, but it'd be more effort than what we do now, which is just some trivial straight-line Rust code, in an init that was going to be Rust anyways.
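To give a sense of what that straight-line code looks like, here is a minimal sketch of a VM init that mounts the usual pseudo-filesystems and then hands off to the guest's entrypoint. It assumes the `nix` crate, the paths are illustrative, and it is not Fly's actual init:

```rust
// Minimal sketch, not Fly's init: mount a few pseudo-filesystems, then run
// the guest's entrypoint as PID 1's child. Signal handling, zombie reaping,
// and network setup are omitted. Assumes the `nix` crate; paths are made up.
use nix::mount::{mount, MsFlags};
use std::process::Command;

fn main() {
    // The pseudo-filesystems most userlands expect to find.
    std::fs::create_dir_all("/proc").expect("mkdir /proc");
    std::fs::create_dir_all("/sys").expect("mkdir /sys");
    mount(Some("proc"), "/proc", Some("proc"), MsFlags::empty(), None::<&str>)
        .expect("mount /proc");
    mount(Some("sysfs"), "/sys", Some("sysfs"), MsFlags::empty(), None::<&str>)
        .expect("mount /sys");

    // Hand off to the guest image's entrypoint and mirror its exit code.
    let status = Command::new("/app/entrypoint")
        .status()
        .expect("spawn entrypoint");
    std::process::exit(status.code().unwrap_or(1));
}
```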
Yeah, so you'd use an initrd as "your" environment, run your programs, pivot_root (or the modern equivalent) and then pass execution to your guest's environment, which can have its own init.
I say it will eventually be rewritten as a shell script (minus the stats, which will be replaced by a monitoring agent) because people messing with embedded Linux often try writing their own compiled init to bundle with an initrd. It eventually becomes a hassle, so they either make their own feature-filled init replacement, or they go back to a shell script. Most go for the shell script.
I mean, it's a DSL for composing programs. It's too powerful for what you need to do, but all its other advantages (portability, flexibility, simplicity, tracing, environment passing, universal language, rapid development, cheaper support, yadda yadda) make it a win over time. The only reason I can see not to use a shell script is if speed is your highest priority.
I like shell scripts too. I once wrote a shell script that converted X.509 certificates into shell scripts that generated the same X.509 certificate, field by field. Shell scripts are great. Not the right tool here, but, great.