I think the prompt is probably at fault here. You can use LLMs for object segmen... | Hacker News


RobertDeNiro 32 days ago | on: Benchmarking leading AI agents against Google reCA...

I think the prompt is probably at fault here. You can use LLMs for object segmentation and they do fairly well; less than 1% seems too low.

mdahardy 32 days ago

The cross-tile challenges were quite robust: every model struggled with them, even though we tried several iterations of the prompt. I'm sure you could improve results with specialized systems, but the models definitely struggle with segmentation out of the box.
