
Scale AI wrote a paper a year ago comparing various models' performance on benchmarks to their performance on similar but held-out questions. Generally the closed-source models performed better, and Mistral came out looking pretty bad: https://arxiv.org/pdf/2405.00332



