The Hitchhiker's Guide to the Galaxy defines the marketing division of the Sirius Cybernetics Corporation as "a bunch of mindless jerks who'll be the first against the wall when the revolution comes,"
Clickhouse has a wide range of really interesting technologies that are not in Postgres; fundamentally, it's not an OLTP database like Postgres but more-so aimed at OLAP workloads. I really appreciate Clickhouse's focus on performance and quite a bit of work goes into optimizing the memory allocation and operations among different data types.
The heart of Clickhouse are these table engines (they don't exist in Postgres) https://clickhouse.com/docs/engines/table-engines . The primary column (or columns) is ordered in some way and adjacent values in memory are from the same column in the table. Index entries span wide areas (EG: By default there's only one key record in the primary index for every 8192 rows) because most operations in Clickhouse are aggregate in nature. Inserts are also expected to be in bulk (They are initially a new physical part that is later merged into the main table structure). A single DELETE is an ALTER TABLE operation in the MergeTree engine. :)
This structure allows it to literally crunch billions of values per second (brutally, not with pre-processing, erm,
"tricks" although there is a lot of support for that in Clickhouse as well). I've had tables with hundreds of columns and 100+ billion rows that are nearly as performant as a million row table if I can structure the query to work with the table's physical ordering.
Clickhouse recommends not using nullable fields because of the performance implications (it requires storing a bit somewhere for each value). That's how much they care about perf and how close to the raw data type it is that their memory allocation uses. :)
> Inserts are also expected to be in bulk (They are initially a new physical part that is later merged into the main table structure). A single DELETE is an ALTER TABLE operation in the MergeTree engine.
> They are initially a new physical part that is later merged into the main table structure
The reason I mentioned it is because it's a huge surprise to some people that... from the docs: "The ALTER TABLE prefix makes this syntax different from most other systems supporting SQL. It is intended to signify that unlike similar queries in OLTP databases this is a heavy operation not designed for frequent use. ALTER TABLE is considered a heavyweight operation that requires the underlying data to be merged before it is deleted."
There's also a "lightweight delete" available in many circumstances https://clickhouse.com/docs/sql-reference/statements/delete. Something really nice about the ClickHouse docs is that they devote quite a bit of text to describing the design and performance implications of using an operation. It reiterates the focus on performance that is pervasive across the product.
This is one of the most interesting interviews I've ever read/listened to. Reminds me of when I first heard a Lex Fridman interview (the style is completely different but it hits on a lot of material that is interesting purely due to the openness of the interviewee to talk about whatever and how the interviewer drives the conversation).
If you are at all interested in the current challenges being grappled on in this space, this does a great job of illuminating some of them. Many many interesting passages in here and the text transcript has links to relevant papers when their topics are brought up. Really like that aspect and would love to see that done a lot more often.
I too wish deprecation with migration path was a more common pattern in today's language development. The language has very much needed work and the numerous bugs within Apple's own libraries certainly hasn't helped.
That said, some of the, erm, "new ways" to solve problems have been significant advancements. EG: Async/Await was a huge improvement over Combine for a wide variety of scenarios.
This is a very well done attack. Enjoyed reading about your efforts to gain community credibility. You rapidly transformed this from a small number of victims into an epidemic.
I'm surprised that VSCode extensions don't have a permissions system (EG:
"Request network access").
Yeah, I've written a few of these and should probably release a package at some point but each version has been somewhat domain specific.
The last time we measured an immediate 99% performance improvement over SNS+SQS. It was so dramatic that we were able to reduce job resources simply due to the queue implementation change.
There's a lot of useful and almost trivial features you can throw in as well. SQS hasn't changed much in a long time.