

trescenzi on Dec 21, 2024 | on: OpenAI O3 breakthrough high score on ARC-AGI-PUB

How many guesses is the human comparison based on? I’d hope two as well but haven’t seen this anywhere so now I’m curious.

nmca on Dec 21, 2024

The real turker studies, resulting in the ~70% number, are scored correctly I believe. Higher numbers are just speculated human performance as far as I’m aware.
