Hacker News

The SWE-bench result is super impressive for a model of any size. However, providing just one benchmark result and having to partner with OpenHands makes it seem like they focused too much on optimizing that number.

