
TBH, I'm finding that people underestimate how useful CPUs are for both inference and fine-tuning. PEFT with access to 64GB+ of RAM and lots of cores can sometimes be cost-effective.
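
Rough sketch of what I mean, assuming the Hugging Face transformers/peft stack; the model name, target modules, and hyperparameters below are just placeholders, not a recommendation:

    # LoRA fine-tuning on CPU: only small adapter matrices are trained,
    # so optimizer state and gradients fit comfortably in system RAM.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_name = "gpt2"  # placeholder; any small causal LM works
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)  # stays on CPU by default

    lora_config = LoraConfig(
        r=8,                        # low-rank adapter dimension
        lora_alpha=16,
        target_modules=["c_attn"],  # attention projection in GPT-2; model-specific
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of total params

    torch.set_num_threads(32)  # use the cores you have; tune to your machine

From there a standard Trainer loop works unchanged; the point is just that the trainable footprint is small enough that plenty of RAM and cores can stand in for a GPU on modest workloads.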


I think engineers learn this quickly in high-scale, performance-sensitive production environments, even without hardware backgrounds. SLAs and costs create constraints you have to optimize against after promising the business line that these magical models can enable that cool new feature for a million users.

Traditional AI/ML models (including smaller transformers) can definitely be optimized for mass scale and performance on CPU-optimized infrastructure.
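
For example, something like dynamic int8 quantization in PyTorch is a cheap win for CPU serving; the checkpoint below is only an illustrative placeholder:

    # Dynamic int8 quantization for CPU inference: nn.Linear weights are
    # stored as int8 and activations are quantized on the fly at runtime.
    import torch
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased-finetuned-sst-2-english"  # placeholder checkpoint
    )
    model.eval()

    quantized = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )
    # quantized(...) is then called exactly like the original model,
    # with a smaller memory footprint and faster matmuls on CPU.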



