
The big step was having it reason through math problems that weren't in the training data. Even now, with web search, it doesn't need every article to be in the training data to do useful things with it.


This is using thinking-time compute and reinforcement learning. I think this is going to plateau even faster than the initial LLM scaling did, though.



