Devrel salaries appear to be growing very rapidly.
This was unheard of just a few years ago. Possibly these roles used to be called 'Head of Product Marketing' and have just been rebranded as devrel.
The image has been fixed, and the point I'm making is that proprietary models are almost always ahead, and this gap is widening. Open-source (OS) models that are nearly at the same quality are usually distilled versions of proprietary models, or somehow get training data from them. Sometimes, after massive, expensive training runs, models are open-sourced anyway, and at some point that becomes unsustainable.
The difference between a top model and a model with a similar Elo rating might seem small, but the value of even a marginal increase in intelligence is extremely high. For example, I only use the best coding model for coding, whatever the cost.
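To put a number on "similar Elo", here is a rough sketch assuming the leaderboard follows the standard Elo expected-score formula. The ratings 1300 and 1280 are made up for illustration, not taken from any real leaderboard.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Illustrative ratings only (assumed values, not real leaderboard data).
top_model = 1300.0
runner_up = 1280.0

# Share of head-to-head comparisons the top model is expected to win.
p_top = elo_expected_score(top_model, runner_up)
print(f"20-point gap: top model preferred in about {p_top:.1%} of matchups")
```

Under that assumption, a 20-point gap works out to the top model being preferred in only about 53% of head-to-head votes, which is why the difference looks small on paper even if it matters a lot in practice.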
There's also lots of evidence that large labs are only getting started. In the past year, they have secured massive amounts of compute, which is still not utilized well. I expect lots of big training runs in the future, which will widen the gap between OS and proprietary models further.
The major problem for these companies is that they spend hundreds of millions of dollars training a model, and then someone comes in the next day and distills something almost as good for far less money (still a VERY large sum of money).
Note that distilling a general model is several orders of magnitude more expensive than distilling a task-specific model, which is what I'm trying to promote here. Smart general models make distilling great task-specific models with no expert labelers way easier.
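A toy sketch of what task-specific distillation without expert labelers can look like: a general "teacher" labels raw text, and a small "student" is trained only on those labels. Everything here is hypothetical. `teacher_label` is a rule-based stand-in for a call to a large general model, and the student is a trivial word-count classifier, not a real training pipeline.

```python
from collections import Counter, defaultdict

def teacher_label(text: str) -> str:
    """Hypothetical stand-in for asking a large general model for a label."""
    keywords = ("crash", "error", "fails")
    return "bug" if any(w in text.lower() for w in keywords) else "other"

def train_student(texts, labels):
    """Count how often each word appears under each teacher-assigned label."""
    counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        counts[label].update(text.lower().split())
    return counts

def student_predict(counts, text):
    """Pick the label whose training words overlap most with the input."""
    words = text.lower().split()
    return max(sorted(counts), key=lambda label: sum(counts[label][w] for w in words))

# Unlabeled task data; no human labeler is involved at any point.
unlabeled = [
    "app crash on startup",
    "login fails with error",
    "please add dark mode",
    "love the new release",
]
pseudo_labels = [teacher_label(t) for t in unlabeled]
student = train_student(unlabeled, pseudo_labels)

print(student_predict(student, "crash with error on login"))
print(student_predict(student, "add dark mode please"))
```

The point of the sketch is the data flow: the expensive general model is only used once to produce labels, and the cheap student is what gets deployed for the narrow task.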
I think I'm getting it now: OS models are getting closer, but only via distillation, not by training a new frontier model, which is out of reach for economic reasons.