Yet another reason why I absolutely love Postgres. It might have taken four years, but we finally got what many of us have been asking for. Nothing ruined my day more than having to write insert-update loops; I am beyond ecstatic about this.
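For anyone who never had to write one, here is a rough sketch of the kind of insert-update loop being replaced. It is only an illustration under assumptions: the `kv` table and the `upsert_with_loop` helper are made up for this example, and SQLite (via Python's standard `sqlite3` module) stands in for Postgres so the snippet runs without a server. The string at the top shows the single-statement form that `INSERT ... ON CONFLICT` allows in Postgres 9.5; it is shown for comparison and not executed here.

```python
import sqlite3

# Postgres 9.5+ single-statement form, shown for comparison only (not run here).
# The kv table and its columns are hypothetical names for this sketch.
POSTGRES_UPSERT = (
    "INSERT INTO kv (key, value) VALUES (%s, %s) "
    "ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value"
)


def upsert_with_loop(conn, key, value, max_attempts=5):
    """Old-style upsert: try UPDATE, fall back to INSERT, retry on a race."""
    for _ in range(max_attempts):
        cur = conn.execute("UPDATE kv SET value = ? WHERE key = ?", (value, key))
        if cur.rowcount > 0:
            return "updated"
        try:
            conn.execute("INSERT INTO kv (key, value) VALUES (?, ?)", (key, value))
            return "inserted"
        except sqlite3.IntegrityError:
            # Someone else inserted the same key between our UPDATE and
            # INSERT, so go around again and let the UPDATE pick it up.
            continue
    raise RuntimeError("upsert did not settle after %d attempts" % max_attempts)


# SQLite in-memory database as a stand-in for a real Postgres connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value INTEGER)")

first = upsert_with_loop(conn, "hits", 1)
second = upsert_with_loop(conn, "hits", 2)
final_value = conn.execute("SELECT value FROM kv WHERE key = 'hits'").fetchone()[0]
```

The retry branch is the annoying part: without it, two sessions that both miss on the UPDATE can collide on the INSERT, which is the case a built-in upsert handles for you.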
Because the devs pay attention, and take the time to do things right. See https://wiki.postgresql.org/wiki/UPSERT for a glimpse into the design that needed to go behind this.
I feel a lot of Postgres fans (myself included) put their "money" on Postgres circa late version 7 or early version 8, back when MySQL was the more featureful and performant of the two, while Postgres had the reputation for being more, shall we say, robust. (Remember, those were the days before InnoDB was the default in MySQL.)
The payout for investment in Postgres the past few years has been substantial -- native replication, true serializable transactions, foreign data wrappers, index-only scans, native JSON support, updatable views, and materialized views are among the features added to Postgres in the last 4 years. It's matured from being an "entry-level" RDBMS with few features that all work reliably, to a much more enterprise-friendly RDBMS with many features that still work reliably.
> late version 7 or early version 8, back when MySQL was the more featureful and performant of the two
MySQL was never more featureful. The reason I started migrating back in the 7.1 time frame (when TOAST tables were added and you could finally store more than 8K of text in a TEXT column) was the lack of sub-selects in MySQL.
Even aside from that, Postgres was far ahead when considering basic SQL support: stored procedures, views, subselects, check constraints, triggers, actually enforcing foreign key constraints and so on.
It was significantly slower than MySQL, but it also scaled much better under load. Back then, when you had low load, MySQL would be about twice as fast as PostgreSQL but then as the load increases, MySQL's performance would drop sharply and Postgres would stay consistent.
By now, MySQL has mostly caught up feature-wise, but there's still stuff left that Postgres just does better. Also, even plain ideological reasons (community project vs. Oracle open-core project) would make me want to stay with Postgres.
I also have anecdotal evidence that MySQL still has serious issues in the robustness department which I've yet to see with postgres.
MariaDB, WebScaleSQL, Percona are just three MySQL forks that are completely open source in every single way. Likewise MySQL-Server is GPL so it definitely should be considered as open source.
And if MySQL wasn't robust then YouTube, Facebook, Twitter, Alibaba, LinkedIn etc wouldn't be using it for core parts of their infrastructure. It's definitely robust.
No doubt that PostgreSQL is better than MySQL in many areas though, and it probably could do with a self-contained, single-download PostgreSQL Cluster edition.
>And if MySQL wasn't robust then YouTube, Facebook, Twitter, Alibaba, LinkedIn etc wouldn't be using it for core parts of their infrastructure. It's definitely robust.
Either that or they have staff and infrastructure in place to deal with the lack of robustness. Yes, you need to be prepared to deal with corruption anyways, but the more robust your solution, the more time is left to deal with other things.
Over the years I have seen multiple instances of MySQL table corruption, index corruption and mysqldump exiting with a zero exit code after aborting mid-dump due to table corruption.
Of course I was prepared for this and I always had backups ready, but it was still time-consuming and annoying.
With Postgres I've yet to see any kind of data corruption even though my postgres usage is much heavier than my MySQL usage.
But this is why I said "anecdotal evidence": For me personally, Postgres has proven to be way more robust than MySQL. Is this my inability to properly administer MySQL? Is it me being unlucky with hardware (though Postgres is fine on the same hardware)? Is it me just being unlucky? I really don't know.
> And if MySQL wasn't robust then YouTube, Facebook, Twitter, Alibaba, LinkedIn etc wouldn't be using it for core parts of their infrastructure. It's definitely robust.
Popularity is not an argument for quality. See: crocs, Justin Bieber, PHP.
Those companies you mentioned, like many others, probably use it because they're locked in that technology, not because it's a superior one. Just like banks still use COBOL.
As an enthusiastic wearer of Crocs I have to interject to point out that there are many different aspects to quality - being robust is just one of them. Being able to easily recruit experts, ease of use, availability and quality of tools in the ecosystem are other measures.
I never said popularity was an argument for quality. You did.
What I am saying is that if MySQL wasn't robust then those companies simply wouldn't be using it, since at their scale any bug or weakness will manifest at a level far greater than at, say, a startup. And they have the skills, time and money to choose any technology they want, so I don't buy your argument that they are "locked in". Some like Facebook and LinkedIn have even created their own databases.
And I never said MySQL was a good idea. You did. I am saying that to claim it is not robust flies in the face of available evidence.
"I never said popularity was an argument for quality. You did."
I'd tend to think your statement:
"And if MySQL wasn't robust then YouTube, Facebook, Twitter, Alibaba, LinkedIn etc wouldn't be using it for core parts of their infrastructure. It's definitely robust."
..points out that many (popular) sites use it, and a prerequisite is robustness, which thesaurus-wise, sounds a lot like quality.
I get that
a) you didn't actually say it, and
b) you could mean something much more specific, such as "companies with many highly-starred, complex open source projects which have also re-written major parts of their tech stack, but chose to leave mysql in place".
I know that's a bit more verbose, but given the ambiguity of the statement, it just seems too obviously close to the other author's interpretation.
>What I am saying is that if MySQL wasn't robust then those companies simply wouldn't be using it.
That conclusion does not follow from the premises: that some well known large companies use a technology in no way implies that it is robust.
You are assuming that large, well known companies always intrinsically select for high robustness in their tools, which is not necessarily the case. In fact, there are many well known public cases of the opposite.
It could be (for example) that large companies simply have the spare organizational capacity to deal with a lack of robustness. Perhaps they are locked in and find the ongoing cost of dealing with lack of robustness to be lower than the costs of switching to something more robust. Or it could be that they don't notice, for whatever reason, the lack of robustness. There could be many other possible explanations.
> And I never said MySQL was a good idea. You did. I am saying that to claim it is not robust flies in the face of available evidence.
I am glad you mention the word "evidence", since you have provided none so far. An assorted selection of popular, buzz-worthy sites does not qualify as evidence (or lack thereof) of the validity of the technologies they use and endorse.
For contrast, I linked to a really nice post that explains thoroughly why MySQL is not a good idea in the slightest. I would encourage you to read it.
Replication? Replication? Replication?
UPSERTS?
Also, MySQL supports multiple engines, so when talking about MySQL it makes sense to say which engine you have in mind.
Multiple engines in the MySQL case are an anti-feature, from the MyISAM vs. InnoDB perspective. Far too often the wrong or arbitrary choices are made and engines get mixed within the same database, causing unnecessary confusion and work. "That table is on this storage engine which does not support the operation we need."
I can see the case where a different storage engine supports a different data format or use case (columnar storage for example), but in an RDBMS case I would argue consistency outweighs convenience.
The context of my comment and the comment it responded to was the ~7.1 time-frame which was in 2000. My "never more featureful" comment was related to what was released in 2000.
Back then, there was just MyISAM (and maybe even still ISAM), no transactions, no replication, no upserts. Back then there was nothing that MySQL could do that Postgres couldn't.
If given the choice between doing it half-assed but quickly and doing it right but taking however much time is required to get there, the PostgreSQL team always chooses the latter approach.
This conservatism works well for me considering we're talking about a database here.
One of the contributors has written an interesting article explaining why upsert is difficult to get right (if by "right" you want it to complete reasonably quickly and without any chance of corrupting your data):