
> They will hire anyone who can produce a model better than GPT5, which is the bar for fine tuning

Depends on what you want to achieve, of course, but right now I see fine-tuning primarily as a cost-saving measure: transfer GPT5-level skill onto a smaller model whose inference is then faster/cheaper to run. The trade-off is that this slows down your innovation cycle, which is why it's generally not advisable imo.
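
For the mechanics, here is a minimal sketch of that kind of distillation-by-fine-tuning, assuming you have already collected prompt/response pairs from the larger model. The student checkpoint, dataset file, and hyperparameters are placeholders, not recommendations:

    # Minimal sketch: distill a larger model's behaviour into a smaller one by
    # plain supervised fine-tuning on the larger model's outputs.
    # Student checkpoint, dataset file, and hyperparameters are placeholders.
    import json
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    STUDENT = "Qwen/Qwen3-0.6B"        # small student model (example choice)
    DATA = "teacher_outputs.jsonl"     # one {"prompt": ..., "response": ...} per line,
                                       # generated beforehand with the big model

    tokenizer = AutoTokenizer.from_pretrained(STUDENT)
    model = AutoModelForCausalLM.from_pretrained(STUDENT)
    model.train()

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

    with open(DATA) as f:
        records = [json.loads(line) for line in f]

    for rec in records:                # batch size 1 to keep the sketch short
        text = rec["prompt"] + rec["response"]
        ids = tokenizer(text, return_tensors="pt",
                        truncation=True, max_length=1024).input_ids
        out = model(input_ids=ids, labels=ids)  # labels=input_ids -> causal LM loss
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

In practice you'd batch, mask the prompt tokens out of the loss, and train for more than one pass; the point is only that the "transfer" is ordinary next-token training on the bigger model's outputs.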



I agree this is the main case where it makes sense.

But a recent trend that has cut into those cost savings is that foundation model companies have started releasing small models themselves. So you can build a use case with Qwen 235B, then shrink down to 30B, or even all the way down to 0.6B if you really want to.

The smaller models lose some accuracy, but plenty of use cases are still solvable by these much more efficient models.
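
If it helps, a rough sketch of that "shrink until accuracy drops" loop is below: run the same small eval against progressively smaller Qwen3 checkpoints and stop when the accuracy bar is no longer met. The checkpoint names are examples, and the eval set and threshold are placeholders:

    # Rough sketch: evaluate progressively smaller checkpoints and stop once
    # accuracy falls below the bar; the previous (larger) model is then the
    # smallest viable choice. Eval set and threshold are placeholders.
    from transformers import pipeline

    CANDIDATES = ["Qwen/Qwen3-30B-A3B", "Qwen/Qwen3-4B", "Qwen/Qwen3-0.6B"]
    EVAL_SET = [  # (prompt, substring expected in the answer)
        ("Sentiment of 'great product, would buy again':", "positive"),
        ("Sentiment of 'broke after a day':", "negative"),
    ]
    THRESHOLD = 0.9

    for name in CANDIDATES:            # ordered largest to smallest
        generate = pipeline("text-generation", model=name)
        hits = 0
        for prompt, expected in EVAL_SET:
            answer = generate(prompt, max_new_tokens=16)[0]["generated_text"]
            hits += int(expected in answer.lower())
        accuracy = hits / len(EVAL_SET)
        print(f"{name}: accuracy {accuracy:.2f}")
        if accuracy < THRESHOLD:
            break                      # the previous size was the smallest that worked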



