
> when we get a prompt working reliably on one model, we often have trouble porting it to another LLM

I saw a study where a prompt massively boosted one model's performance on a task, but significantly reduced another popular model's performance on the same task.
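A quick way to see that kind of divergence for yourself is to run the exact same prompt through both models on a small eval set and compare accuracy. A minimal sketch, assuming the official openai Python client; the model names, prompt, and eval examples are placeholders, not the ones from the study:

    # Run one prompt against two models and compare task accuracy.
    # Placeholder models/eval set; requires `pip install openai` and
    # an OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    PROMPT = "Answer with a single word: what is the capital of {country}?"
    EVAL_SET = [("France", "Paris"), ("Japan", "Tokyo"), ("Canada", "Ottawa")]
    MODELS = ["gpt-4o-mini", "gpt-3.5-turbo"]  # placeholder model names

    def accuracy(model: str) -> float:
        correct = 0
        for country, answer in EVAL_SET:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user",
                           "content": PROMPT.format(country=country)}],
                temperature=0,
            )
            if answer.lower() in resp.choices[0].message.content.lower():
                correct += 1
        return correct / len(EVAL_SET)

    for m in MODELS:
        print(m, accuracy(m))

Swapping in the prompt variants you care about makes the cross-model gap directly measurable instead of anecdotal.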

Do you have a link or a search term I could use to find that study?
