Is the 2000TB/day number reasonable, or just clickbait?
It's all based on coming up with two numbers that, when multiplied, say this resource is downloaded 73 billion times a day. That's 20 times per day for every internet user (even those using it on a 2G connection once a year). Given a reasonable caching period of one month, that'd mean the average user visits 600 different sites a month. That seems like far more than one would expect.
Or to look at it another way, global IP traffic is apparently estimated at 72000 PB/month, i.e. 2400 PB/day. This font alone would then account for 0.3% of all internet traffic, or 1% of all non-video traffic. Again, that's a very high number.
But at least it doesn't appear to be a totally impossible number either, just an awfully implausible one.
That's how you can tell flat isn't just a trend - it's going back to what worked, after years of doing rounded corners and drop shadows because we could, never stopping to wonder whether we should. There's a lost decade in web design, but hopefully we're now starting to recover from it.
ps. I can't use any of your examples. I use a portrait-oriented screen, which pushes the examples list below the example. Hovering over an item changes the example, which changes the height of that area, which pushes the list up or down and fires a hover event on a different item, and the process repeats until the area is either too short or too tall for my cursor to still be over an item in the list.
Consider making it a .onClick() event below a certain screen size? :)
E: Woah! Just realized you're the same kalid behind betterexplained. Another really awesome resource. :)
Great feedback, thanks! The examples should probably load on either hover (preview) or click (load). Currently it takes you off the example list and to a standalone calc, which is confusing. (There are so many usability issues you discover as things get out there.)
It's awesome to hear when it's been a long-standing tool in the utility belt. The original version had charts/graphs (simple pie charts, etc.) which I need to bring back. Were you thinking of more Wolfram Alpha-style 3D plots?
Thanks for the feedback. The colors are more syntax highlighting (vs indicating anything important), I'd like to research best practices here. I wonder if there's any client-side settings I can detect to indicate a different stylesheet should load.
Thanks! I'm trying to make the imgur/pastebin of calcs. Whip something up on a forum (ahem) and share it in seconds. Let people try and share their own scenarios.
I can put your mind at ease: I took a bit of poetic-mathematical license with that number. I figured that the big sites and the small sites together would average a perfect 1000 fresh-cache visitors a day, but in all probability I'm off by at least 3 or 291 visitors :)
I think you misunderstand the effect of the long tail. For that many sites the average is going to be tiny. And where did the figure of 73 million come from?
73 million sites comes from our friends at Meanpath (their search engine isn't running anymore, sadly). They showed us on 7.3% of websites. Best guess is there are about a billion sites on the internet. Thus, 73 million.
Where did the 1000 visitors/site/day figure come from? This is what seems unlikely and improbable. This resource is being cached as well, so these numbers do seem dubious.
Not sure on the 1000 visitors/site/day. But we're on some pretty big sites as well. IIRC, the final number there is higher than what MaxCDN (the BootstrapCDN folks) serves. Not by much though. Within an order of magnitude I think (which is pretty accurate for this kind of guesswork).
Of those 73 million, how many of them are serving the file themselves, vs pointing to it hosted by someone else? I know you can just point to google directly for their fonts, or unpkg.com for js packages. Is that something people also do for font awesome? Those shared links would greatly increase the cache level, and reduce the number of cache misses, and amount of data served.
According to Netcraft [0] there are currently 171,432,273 active websites. However there were 1,436,724,046 host names on the net. A total of 6,225,374 web-facing computers was reported.
Interestingly the number of active sites hasn't materially changed in 5 years, but host names have skyrocketed.
> If each day 73 million websites serve the Font Awesome WOFF to an average of a thousand visitors who don’t have these fonts in the browser’s cache...
This assumption that they don't have the fonts cached is only sensible if each user is unique. But that means that we have 73 BILLION unique users.
Not necessarily: your browser caches a different copy of font-awesome for each website you visit. There's no way for it to tell that site1.com/static/font-awesome is the same file as site2.com/resources/font-awesome until it makes the HTTP request and downloads it.
This is a great article. Some of these are already on the agenda for FA5, but there's some new stuff there for us too. We'll dig in and it's a TODO on the FA5 roadmap now. :)
Another thing I'm super excited about is removing stray points in vectors. We found some new tools for Illustrator that make this a LOT easier, and it will have a very real impact on bandwidth as well.
(Oh and FA5 Pro CDN will allow loading just the icon categories you need. And for real granular control, FA5 Pro includes a desktop subsetter for all backers too.)
Hey, glad you liked it! I tried to get in touch but I guess the message got lost in your noisy inboxes — must be a busy time for you folks. Ping me if there's anything I can help with!
I like and use font-awesome. Thanks and keep up the awesome work. Are there advantages to Wenting Zhang's [approach](http://cssicon.space)? Is there any way of lazy-loading, i.e. accepting that some people (either through ignorance or laziness) will load the entire set while only using a couple of icons? Even if it were just some tool to craft the URL, it might get us on the road to doing something about those petabytes per day (really?).
There are some really valid points in here and I dislike the idea of using the whole font when only a few icons are required.
But isn't subsetting going to result in users caching your subset instead of reusing a cached copy of everything? I would think that does more harm than everyone grabbing a full copy once from a CDN and caching it.
It's one of those things that works "in a perfect world", but in the real world it just doesn't work out that well.
For starters, leveraging caching via a common CDN pretty much requires everyone to be using a single version from a single CDN. If you can't agree on that, then every time a new version comes out the web is split and the caching doesn't work, and every time someone decides to use another CDN (or someone provides a new one) the group is split again.
But then split that across all the fonts, formats, and compression schemes available and you'll see that the chance that a visitor has seen that font, at that version, from that CDN, using that compression scheme, in that format at any point in the past EVER is actually significantly smaller than you'd think.
Which brings us to the next point. Even if you've seen it before, the chances that you'll still have it cached are pretty small. Browser caches are surprisingly small in the grand scheme of things, and people tend to clear them more often than you think. Add in private browsing mode and "PC cleaner" programs, and the average person's cache lasts far less time than I, at least, expected it to.
But even worse are mobile caches. IIRC older Android had something like a 4MB cache!!! And until very recently Safari had something like a 50MB limit (and before that didn't cache ANYTHING to disk!). Now it's better, but you are still looking at a few hundred MB of cache. And with images getting bigger, big GIFs being common, and huge numbers of AJAX requests happening all the time on most web pages, you'll find that the browser cache is completely cycled through on a scale of hours or days, not weeks or months.
IMO it's at the point where the "dream" of using a CDN and having a large percentage of your users already have the item in their cache isn't going to work out, and you are better off bundling stuff yourself and doing something like "dead code elimination" to get rid of anything you don't use. And that method only becomes more powerful when you start looking at custom caching and updating solutions. A few months ago I saw a library that was designed to only download a delta of an asset and store it in localstorage so updates to the application code only need to download what changed and not the whole thing again. Sadly I can't seem to find it again.
> For starters, leveraging caching via a common CDN pretty much requires everyone to be using a single version from a single CDN. If you can't agree on that, then every time a new version comes out the web is split and the caching doesn't work, and every time someone decides to use another CDN (or someone provides a new one) the group is split again.
All this common web stuff that is distributed by several CDNs (as well as separately by individuals) really suggests to me that there should be some browser feature like `<script src="jquery.min.js" sha256="85556761a8800d14ced8fcd41a6b8b26bf012d44a318866c0d81a62092efd9bf" />` that would allow the browser to treat copies of the file from different CDNs as the same. (This would nicely eliminate most of the privacy concerns with third-party jQuery CDNs as well.)
Because anything that can cross domains instantly allows anyone to probe your browser to see what is in your cache.
So to take it to a bit of a ridiculous (but still possible) point, I could probably guess what your HN user page looks like to you. From there I could serve that in an AJAX request to all my visitors with its content-based hash, and if I get a cache hit from someone, I can be pretty damn sure it's you.
And that only really solves one or 2 of those issues. The versioning, compression schemes, formats, number of fonts, and sizes of browser caches will still cause this system's cache to be a revolving door, just slightly more effective.
And as for the security concerns of using a CDN: Subresource Integrity (which someone else here linked already) allows you (you being the person adding the <script> tag to your site) to say what the hash of the file you expect is, and browsers won't execute it if it doesn't match. So that lets you include 3rd party resources without fear that they will be tampered with.
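For reference, a minimal sketch of what an SRI-protected tag looks like (hypothetical CDN URL, placeholder hash):

```html
<!-- Sketch only: the URL is made up and the hash is a placeholder for the
     base64-encoded digest of the exact file you expect. If the downloaded
     bytes don't match, the browser refuses to execute the script. -->
<script src="https://cdn.example.com/some-lib.min.js"
        integrity="sha384-BASE64_DIGEST_OF_THE_EXPECTED_FILE"
        crossorigin="anonymous"></script>
```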
Ideally this would be used with a sort of `Cache-Global: true` header in HTTP, and then you would only be able to grab things that are intended to be cached like this. It would do nothing to stop super-cookies with this method though.
Security hole: This could leak hash preimages that the user has in cache but are sensitive.
Solution: Using a sha256="..." attribute should only allow you to access files that were initially loaded with a tag that has a sha256 attribute, and this attribute is only used for resources the developer considers public.
Was about to reply with exactly that information, but as it turns out, doing content-addressable caching via the SRI mechanism apparently has some problems and maybe is not possible:
Yeah, I've thought about this a few times myself. Maybe I'm missing something that makes it impossible/risky? Or maybe it's just the tendency to ignore simple solutions.
This not only solves the CDN issue, it also solves the issue of having to rename the files manually every time someone makes a change. It just makes caching that much saner.
It can be used to subvert the same origin policy and content security policy.
If you see a script tag with the URL bank.com/evil.js, the browser shouldn't assume that the bank is actually hosting evil.js. Even if the hash matches something in the cache, that content may never have been served by the bank at all.
The bank might be using a content security policy to minimize the damage that an XSS attack can do. It only allows script tags from the same origin. However, now an attacker just needs to load evil.js with a particular hash into the cache, and they can create the illusion that the site is hosting it, without having to hack the server to do so.
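To make that concrete, here's a sketch of the attack using the hypothetical hash attribute proposed upthread (this is not real browser behaviour, just an illustration):

```html
<!-- Illustration only, using the hypothetical sha256 attribute from upthread.
     Assume bank.example sends: Content-Security-Policy: script-src 'self'
     An XSS injection adds a tag whose URL is same-origin, but whose content
     would be satisfied from a shared, content-addressed cache that the
     attacker has already seeded on another site: -->
<script src="https://bank.example/evil.js"
        sha256="DIGEST_OF_THE_ATTACKER_SCRIPT_ALREADY_IN_CACHE"></script>
<!-- The CSP origin check passes (the URL points at bank.example), yet the
     bytes that run were never actually hosted by the bank. -->
```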
Awesome description of the fragmentation and browser cache size problems that prevent these shared CDNs for common JS/CSS/whatever files from providing optimal benefits the vast majority of the time. The challenge is that people still say "well, even if it only works some of the time, that's OK."
It's not, because this hurts page load times.
You have to create a new, cold TCP connection to go fetch 50-100KB of CSS/JS/whatever from some random server, which even with HTTP/1.1 is usually slower than just bundling that into your own CSS/JS/whatever. HTTP/2 makes the gap even wider.
These things highlight how the current system of font distribution is really suboptimal. Even CDN hits are metered, and the idea that I need to either load or cache a bunch of data to render text is dumb.
My employer manages something like 20k devices. I betcha we spend 5 figures annually on this crap.
Installing a ton of fonts up front takes a pretty significant amount of space; installing a subset for the user's language/preference, or letting the user manage it, makes it VERY easy to fingerprint users based on what fonts they download; and any kind of cross-origin long-term caching is a security nightmare, as it lets you begin to map out where a user has been just based on what they download.
Fonts are a pretty important factor in design. Most people may not explicitly notice it, but it certainly affects the impression they get from a website.
You could compare it with HTTP/2: if you do a survey, you won't find many people who even know about it. That doesn't mean it's useless to them.
> Most people may not explicitly notice it, but it certainly affects the impression they get from a website.
Most people already have attractive, readable fonts installed on their computers, which are likely either sensible defaults, or configured for specific reasons (e.g. minimum size to deal with eyesight). Web pages that render as blank white space for a while, or briefly tease me with text before blanking it out, give me a much more negative impression than ones that simply render in my chosen default fonts.
This is an interesting comparison because web fonts have the opposite effect of HTTP/2: They introduce a huge delay between clicking a link and being allowed to actually read anything.
On 3G or shaky Wi-Fi, I've regularly given up on browser tabs because all I see is an empty page, even after half a minute of loading and when most images have finished downloading. (Maybe other browsers are better than Safari, but I won't switch just to see prettier fonts.)
If old mobile browsers have 4MB caches, 160+k of that is a big chunk. If you could reduce it to e.g. 10k, you'd need 16 sites all using FA with different font selections before you equal the original size. There's a reasonable chance that it's an investigation worth doing.
Another option: find the most-used icons or combinations. Group them.
Another option: similar to nandhp's suggestion, get a hash of the font selection and name the file accordingly. There's a very good chance a nearby proxy has that combination stored already.
It's not about disk, it's about network I/O. Making your resource "more unique" means more cache misses and more requests that need to be served (in theory anyway, see Klathmon's sibling comment[0] for more on this).
Not being a fan of using shared CDNs for static resources[1], I don't see the issue here. Users are going to have to download something anyway, so from their perspective it's still one download, just smaller. With proper unique namespacing (a unique URL per version) and HTTP Cache-Control headers, they only have to download it once (assuming they don't clear their local cache).
[1]: Combination of security reasons and unnecessarily coupling apps to the public internet.
What an interesting thought. I wonder what the actual user base of a library would have to be before it would even itself out and then go over the threshold of where you would see a return. Certainly 74 million sites should do it if they were all using the same CDN but I have no idea how you would start to even try to calculate this.
Unfortunately it's the same with Bootstrap and other JS/CSS frameworks/libraries like that. You typically only use a small subset, but it is non-trivial to carve out the much smaller set that you need. There is some tooling that claims to do this cleanup, but I'm not sure how well tested it is.
Even if your subset is the same as mine, if each of us is doing the subsetting ourselves, we won't share the same URL, hence the browser will still fetch it twice.
On the other hand, it seems that FA themselves are building a CDN with subsetting, so they could in fact provide those shared subsets. Unfortunately (but understandably) it's paid, so most of us can't use it.
If you're only using a small subset of FontAwesome (as I suspect many people do), I'd imagine at some point it'd make a lot of sense to use data-uri's to effectively embed them directly in an existing request, which would be faster than a CDN?
I saw a talk by someone from Smashing Magazine where they basically did this for a subset of their WebFonts (downloading the entire font asynchronously afterward), then cached their WebFonts in LocalStorage. It seems like it would make even more sense for an icon font, where you're using a very small subset.
When reducing SVGs for data URIs, I've found you can reduce size even further by scaling the image up, then rounding coordinates to integers. I've written a few scripts to assist with this, but I imagine you could get a similar effect in Inkscape by snapping everything to an integer grid.
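A hypothetical before/after to illustrate the idea (made-up triangle path; the viewBox and every coordinate are multiplied by 10, then rounded):

```html
<!-- Illustration only: same shape, visibly identical, but a shorter path string. -->
<!-- before -->
<svg viewBox="0 0 16 16"><path d="M1.2483,0.9371 L14.7519,0.9371 L7.9993,14.0627 Z"/></svg>
<!-- after: coordinates ×10, rounded to integers -->
<svg viewBox="0 0 160 160"><path d="M12,9 L148,9 L80,141 Z"/></svg>
```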
And I'm sure there's further room for manual reduction, maybe by switching between strokes/fills and further cleaning up the paths.
(It would be cool if someone were to work on an automated system for doing this kind of simplification. But I have a feeling that it's difficult to find a general technique that will work across SVG hierarchies, like scaled/rotated groups, etc.)
For my blog, I embedded all my webfonts into my CSS using base64 data URIs. That CSS is 100k, but my server gzips it to 72k, which is not much bigger than the CSS without the fonts + the fonts separately. Because I don't have HTTP2 yet, fewer requests make it faster (especially considering HTTPS), despite the extra size.
101K 9 Dec 18:30 with.css
99K 9 Dec 18:31 with.min.css
73K 9 Dec 18:31 with.min.css.gz
6.8K 9 Dec 18:30 without.css
5.2K 9 Dec 18:31 without.min.css
1.8K 9 Dec 18:31 without.min.css.gz
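For the curious, the inlined rules look roughly like this (a sketch with a hypothetical family name; the base64 payload is truncated):

```css
/* Sketch only: family name is made up and the base64 data is truncated. */
@font-face {
  font-family: "BlogBody";
  src: url("data:font/woff2;base64,d09GMgABAAAA...") format("woff2");
  font-weight: 400;
  font-style: normal;
}
```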
Although this might be true:
> …72k, which is not much bigger than the CSS without the fonts + the fonts separately.
The issue here is that CSS is a render-blocking resource and fonts are not; ergo you’ve increased render-blocking assets from 1.8K to 73K, meaning the user has to download 4056% more bytes before they get to see anything at all. Base64 encoding assets is usually a huge anti-pattern. You’d be better off moving those fonts back out into separate requests.
I hope this comment is taken in the spirit it’s intended: I like making the internet fast :)
Have a great weekend!
(If you’re interested in further reading, look up anything to do with the critical rendering path.)
One advantage of embedding the fonts is that it eliminates FOUT. If the fonts were split out, the page would need to be re-rendered once the fonts loaded anyway. I don't see much point in rendering an already-light page if it's going to be re-rendered differently soon after.
By embedding the fonts, I've optimized for latency by downloading fewer files over HTTP/1. Even a 2k CSS file and a 20k-ish font are hardly worth the cost of opening the connections; a 73k all-in-one is more efficient. Let's assume a reasonable 8 Mbit connection and 50ms RTT.
To download the 72k CSS on a new connection (because it's early in the page load), that takes 100ms to establish TCP and HTTPS handshake, and 122ms to download the 72k embedded font CSS, total: 222ms (at least, not including processing overhead).
For the split version: 100ms for handshakes, and 52ms for the 2k CSS transfer. The browser might then keep that connection open to download the headline font (negligible extra setup on an open connection), and open another connection for the body text font (the third, monospaced font is rarely used). The open connection downloads one 23k font in 73ms; the other connection, in parallel, takes 100ms for its handshake plus 73ms for the other 23k font. Total: 325ms (at least, not including processing overhead).
For higher bandwidth connections, bigger files make even more sense. For split files to truly win, latency will need to be impossibly low, which for the hosting from my apartment basement (even on 150mbit fiber), is impossible.
Even though it's render-blocking, browsers will continue loading other assets anyway. A mostly (if not fully) complete page will show on first render, often within a second on wired connections.
> I hope this comment is taken in the spirit it’s intended: I like making the internet fast :)
Hah, this is getting interesting. Would be more fun doing this over a beer, but…
> Let's assume a reasonable 8mbit connection and 50ms RTT.
This is a huge assumption; you’ll be neglecting almost all of your mobile users there. But! Optimise for your use-case, of course.
If you’re self-hosting your fonts I would leave them external, `preload` them, and use the preferred `font-display` mechanism to deal with (or eliminate) FOUT and/or FOIT. That’s your extra connections taken care of, and we’ve moved fonts back off of the critical path :)
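Roughly, that looks like this (a sketch with a hypothetical font file; not a drop-in config):

```html
<!-- Sketch only: file name and family are made up. Preload kicks off the
     font download early; font-display controls what happens while it loads. -->
<link rel="preload" href="/fonts/body.woff2" as="font" type="font/woff2" crossorigin>

<style>
  @font-face {
    font-family: "Body";
    src: url("/fonts/body.woff2") format("woff2");
    font-display: swap; /* show fallback text immediately, swap when the font arrives */
  }
</style>
```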
I think what we’ve both hit upon here is the fact that browsers need to decide on a standard way of handling FOUT/FOIT. Leaving pages blank for up to 3s (or in Safari’s case, indefinitely) would completely eliminate the need to try and circumvent it by base64 inlining :(
Wouldn't the most obvious solution be to open-source FA fonts and include them in Linux, Windows, Mac OS, iOS and Android? If they were installed on all systems then we wouldn't really have this issue at all. Given the ubiquity of some fonts this doesn't seem impossible.
Or (sorry to say, because I like FA) screw them, and see if Apple/MS/Google can come up with an open-standard icon font that doesn't suck.
Doesn't have to be in the OS. Enough to bundle it with the browsers. Same goes for all major frameworks as well, just ship jquery, angular, bootstrap and react with the browser already, like a super long lived cache.
If that's too steep for you, maybe a browser extension would be a good middle ground.
I've been using https://icomoon.io/app/ for years now to pick only the icons I like, and it can also generate icon fonts from SVG, so I can merge different icon sets and include only the icons I will actually use in my projects.
This is definitely my go-to for slimming down huge icon packs. Font pack companies really shouldn't focus on that since there are so many tools that help developers do this as well.
That's a lot of bandwidth for a single thing. Makes me wonder, perhaps we should have a global repository of popular web libraries for browsers, all of them versioned, like we have with npm/bower. It could be backwards compatible with old browsers by using a real, standardized URL. A single source of truth. It's better than many CDNs providing the same thing over and over again... And since the packages would be immutable, browsers would never need to check for updates (no more waste on 304 responses).
Unfortunately I think a lot of the points in this article are lost if you assume that the CDNs serving up the TTF for FontAwesome are using gzip compression.
Opening the link you provided above with Firefox and bringing up the network monitor, I get the same as you: 109.18 KB "transferred", 131.65 KB "size", and gzip was used according to the headers. I thought that maybe "transferred" might be the size of the data after decompression, and "size" would be "transferred" plus headers, but then I checked Chromium and I see 110 KB there as well.
It seems very strange that the size reported by the parent is varying. Why would it do that? Are they all the same file or different ones?
- Font rendering is very inconsistent across platforms, so your icons can look pretty bad (e.g. on Windows)
- You're more likely to hit alignment/layout issues because you're using a tool designed for text to render pictures
- The character codes used are textual information but have no semantic meaning (this can be addressed by using ligatures or characters in the Unicode Private Use Area)
- Adding more icons requires regenerating the font
- Using a subset of icons requires regenerating the font
- Font authoring is non-trivial and requires much more specialized tools and knowledge when compared to SVG
- You need to ship the font in various formats (TTF, OTF, WOFF, WOFF2)
- It's not easy to animate icons, or include colors or transparency
- By now SVG support is sufficiently widespread, so the original rationale for using font icons no longer applies
None of these, except maybe the rendering quality, are deal breakers. Workarounds and counterarguments do exist. And SVG does have some disadvantages too (e.g. parsing overhead, filesize, memory consumption)
I personally side with using SVGs currently. As the technology progresses, I may well reassess my view.
Yes 100% put your icons in an include file of svg defs.
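Something along these lines (a sketch with a made-up icon name and path; the sprite block can live in an include file that gets inlined into the page):

```html
<!-- Sketch only: icon id and path data are made up. Using <symbol> here,
     which behaves like defs but carries its own viewBox per icon. -->
<svg xmlns="http://www.w3.org/2000/svg" style="display:none">
  <symbol id="icon-search" viewBox="0 0 16 16">
    <path d="M6 1a5 5 0 1 0 3.1 8.9l4 4 1.4-1.4-4-4A5 5 0 0 0 6 1z"/>
  </symbol>
</svg>

<!-- Anywhere an icon is needed: -->
<svg class="icon" width="16" height="16"><use href="#icon-search"/></svg>
```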
There are a number of tools that will optimize SVG files for size, like svgo. Output straight from Illustrator, for example, has a lot of room for improvement. I also usually end up hand-tweaking things, like removing the embedded styles and doing that in my site's CSS.
What I think is really still needed are more options besides Illustrator for generating SVG content. Yes, Inkscape is overall OK, but it is lacking in ease of setup and general speed.
There are a number of programs that are so close to doing what is needed and then won't save or export an SVG. Pixelmator, looking at you.
Are those the same people that disable JavaScript everywhere too? Imagine that you were running desktop apps and said "OK, I want this program, but I only want it to use standard widgets, my chosen font, and not use certain APIs. Wait, why does this look funny and not work right?!"
Actually not. I allow Javascript but I block web fonts because I think they're a stupid idea.
Your argument about desktop apps isn't very good because people can (and do) prevent desktop apps from doing certain things and accessing various stuff, like web cams, the microphone, system directories, the internet, etc. If my text editor breaks because it can't access my webcam, I'd say it's a serious problem.
It seems worth noting that as web browsers have removed the ability to disable JavaScript they have added options to disable web fonts. I flipped this switch in Firefox for Android because it makes pages load noticeably faster and conserves data. Lots of sites seem to host web fonts on third-party CDNs which require a separate TCP connection: a big deal on high-latency mobile.
Blocking web fonts usually does not noticeably degrade a site's usability or functionality. Icon fonts are the exception. I've learned that a question mark in the upper-right corner is usually a site's menu, except when it's the search function. I don't think that this level of breakage is comparable to what happens when JavaScript is disabled, since many sites just break completely without scripting.
Stuff like this is like rearranging deck chairs on the Titanic - 26 kilobytes is a drop in the bucket compared to the bloat created by all the other shit people slap onto their sites - ad scripts, for example.
That is just a terrible analogy. Saving 20% of bandwidth, if you really want to make an accurate analogy to an event in which 1500 people died, would be like saving the lives of 300 people.
Saying "Screw it, More people died in other places" is missing the point. The article even admits this is not the #1 place to optimize... but that doesn't mean you should scoff at it.
I don't think this is a platform-specific thing. It's an 'artsy' font that takes away from the reading experience. While it might be fine for graphic design, it certainly doesn't seem like one that's suited for big blocks of text.
>How Font Awesome 5 Became Kickstarter’s Most Funded Software Project
I checked their Kickstarter, however:
>35,550 backers pledged $1,076,960 to help bring this project to life.
Pillars of Eternity, most assuredly software, was a kickstarter I backed. Their campaign page reads to this day:
>73,986 backers pledged $3,986,929 to help bring this project to life.
Why make up such a shitty lie? Does it matter if you have the longest John in the pub? Is this something USA specific down the American Dream/Meritocracy/Competitiveness axis? I find most European projects to be more modest about these things.
No hate please! None intended here. It just feels weird to me.
The fork commits only the modified font files, so as soon as the upstream project modifies the font files it will be out of date. Might it not be better to script the optimization passes?
You can also serve the font files with the `Cache-Control: immutable` HTTP response header so the client never needs to revalidate with the server (even when the user forces a page reload). Use versioned filenames for immutable resources if you need to change them later.
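For example, the response for a versioned font file might carry headers like these (a sketch, assuming a hypothetical /fonts/fontawesome-4.7.0.woff2 path; not any CDN's actual configuration):

```http
HTTP/1.1 200 OK
Content-Type: font/woff2
Cache-Control: public, max-age=31536000, immutable
```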
We are actually in the process of changing our Font Awesome font usage into svg versions of the font (via Icomoon's free converter tool) due to some of our customers blocking web fonts on IE11 via Windows Registry settings, claiming security issues...
But one nice side-effect of changing it to svg versions is that our clients are now loading icons on demand and only the ones being used in the app, as opposed to the whole web font.
Despite all the hand-waving about the accuracy of the claims or the efficacy of CDN-fronted caching, etc., it is good to see more initiatives around saving (especially mobile) bandwidth. After moving from EE into full-stack JS, I've been blown away by the amount of duplication present.
The FOSS web community should participate in a global feature freeze and stop creating new libraries and frameworks for a year or two to work on stuff like this instead. Just so much opportunity everywhere.