The standard compressed formats don't literally contain a dictionary. The decomp...

yorwba · 2025-07-05T04:39:15 1751690355

> The dictionary needs to be downloaded too

Which is why the idea is to use a previous version of the same file, which you already have cached from a prior visit to the site. You pay the cost of downloading a copy compressed without a dictionary, but only on the first visit. Basically it's a way to restore the benefits of caching for files that change often, but only a little bit each time.
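To make the idea concrete, here is a minimal sketch using Python's `zlib`, which accepts a preset dictionary through its `zdict` parameter. The file contents and the helper names `compress_with_dict` and `decompress_with_dict` are made up for illustration; this is DEFLATE rather than Brotli or Zstandard, and DEFLATE only looks back 32 KB, so it only approximates what a browser would do.

```python
import zlib


def compress_with_dict(data: bytes, dictionary: bytes) -> bytes:
    """Compress `data`, letting the encoder reference bytes from `dictionary`."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress `blob`; the same dictionary must be supplied again."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical "previous version" of a script the client already has cached.
old_version = b"".join(
    b"function handler%d(event) { return dispatch(event, %d); }\n" % (i, i)
    for i in range(200)
)

# Hypothetical "new version": the same file with one small change.
new_version = old_version.replace(b"handler42(", b"handler42_renamed(")

# First visit: no dictionary available, so the full compressed file is sent.
plain = zlib.compress(new_version, 9)

# Later visit: the cached old version serves as the dictionary.
delta = compress_with_dict(new_version, old_version)

print(len(new_version), len(plain), len(delta))
assert decompress_with_dict(delta, old_version) == new_version
```

The dictionary-compressed payload should come out noticeably smaller than the plain one, because almost all of the new file can be encoded as references into the old one.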

zvr · 2025-07-06T19:51:51 1751831511

Of course, the Brotli default (built-in) dictionary is infamous for containing strings such as "Holy Roman Emperor", "Confederate States", "Dominican Republic", etc., due to the way it was created. One can see the whole dictionary at https://gist.github.com/duskwuff/8a75e1b5e5a06d768336c8c7c37....

A dictionary built from the actual content to be compressed will end up looking very different.

pmarreck · 2025-07-07T16:45:09 1751906709

> The dictionary needs to be downloaded too, and you're not going to have dictionaries all the way down

We already have a way to manage this: Standardizing and versioning dictionaries for various media types (also with a checksum), and then just caching them locally forever, since they should be immutable by design.

To prevent an overgrowth of dictionaries with small differences, we could require each one to be an RFC.
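A rough sketch of what such a client-side cache could look like, assuming a dictionary is identified by media type, version, and SHA-256 checksum. The names `DictionaryId` and `DictionaryCache` are hypothetical and not part of any existing standard or library.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryId:
    """Hypothetical identifier for a standardized, versioned dictionary."""
    media_type: str
    version: int
    sha256: str  # hex digest of the dictionary bytes


class DictionaryCache:
    """Keeps verified dictionaries forever; entries are never replaced."""

    def __init__(self):
        self._store = {}  # maps DictionaryId -> dictionary bytes

    def put(self, dict_id: DictionaryId, payload: bytes) -> None:
        # Reject anything whose checksum does not match the published one.
        digest = hashlib.sha256(payload).hexdigest()
        if digest != dict_id.sha256:
            raise ValueError("checksum mismatch for %s v%d"
                             % (dict_id.media_type, dict_id.version))
        # Immutable by design: the first verified copy wins.
        self._store.setdefault(dict_id, payload)

    def get(self, dict_id: DictionaryId):
        # Returns None when the dictionary has not been downloaded yet.
        return self._store.get(dict_id)


# Made-up dictionary contents, for illustration only.
payload = b"<!DOCTYPE html><html><head><meta charset="
html_v1 = DictionaryId("text/html", 1, hashlib.sha256(payload).hexdigest())

cache = DictionaryCache()
cache.put(html_v1, payload)
assert cache.get(html_v1) == payload
```

Because the checksum is part of the identifier, a dictionary that changed by even one byte would be a different entry, which is what makes caching forever safe.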