Hacker News
trescenzi on Dec 21, 2024, on: OpenAI O3 breakthrough high score on ARC-AGI-PUB
How many guesses is the human comparison based on? I’d hope two as well, but I haven’t seen this stated anywhere, so now I’m curious.
nmca on Dec 21, 2024
The real Turker studies, which produced the ~70% number, are scored correctly, I believe. Higher numbers are just speculated human performance, as far as I’m aware.