
    People defer thinking about what correct and incorrect actually
    looks like for a whole wide scope of scenarios and instead choose
    to discover through trial and error.
LLMs are _still_ terrible at deriving even the simplest logical entailments. I've had the latest and greatest Claude and GPT derive 'B instead of '(not B) from '(and A (not B)) when 'A and 'B are anything but the simplest of English sentences.

I shudder to think what they decide the correct interpretation of a spec written in prose is.
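
For concreteness, a minimal truth-table sketch (Python; treating 'A and 'B as bare propositional atoms, which is an assumption since the originals were English sentences) of the entailment in question: '(and A (not B)) entails '(not B), not 'B.

    from itertools import product

    # Brute-force entailment check over the two atoms A and B (names assumed).
    def entails(premise, conclusion, atoms=("A", "B")):
        """True iff every assignment satisfying `premise` also satisfies `conclusion`."""
        for values in product((True, False), repeat=len(atoms)):
            env = dict(zip(atoms, values))
            if premise(env) and not conclusion(env):
                return False  # counterexample found
        return True

    premise = lambda env: env["A"] and not env["B"]   # (and A (not B))
    not_b   = lambda env: not env["B"]                # (not B)
    b       = lambda env: env["B"]                    # B

    print(entails(premise, not_b))  # True  -- the valid conclusion
    print(entails(premise, b))      # False -- the conclusion the models produced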



I would love to see a prompt where it fails such a thing. Do you have an example?


Lisp quotes are confusing in prose.


Still better than my coworkers ...



