
Agreed 100%. For those of us who have already spent ungodly hours writing hyper-detailed specifications for AI, the take that specs are the solution to working with AI coding agents seems ridiculously naive. For context, I've seen this behavior in Claude Code too, and despite being extremely bullish on the technology at first, it has almost convinced me that it just isn't ready for prime time, no matter what the hucksters tell you. Once you start seeing it, you quickly realize that it doesn't matter how many guardrails you put in place or how detailed your specification is if your coding agent randomly decides to ignore your rules or specifications (even in 'brand new context' scenarios). I've lost track of how many times I've asked Claude why it did something when the Claude.md file expressly says to do the opposite (including words like 'important' or 'critical'), or when a specification document it read immediately before implementing, with a brand-new context, says the same. Naturally, Claude's reply is some variation of 'You're absolutely right to call me out on this. I should have done it the way it was spelled out in the specification.'


I have tripwires in my codebase for the case where Claude has a hard time getting a benchmark to run, decides to yeet it, and quietly substitutes mock/synthetic data, to avoid potential scientific-credibility issues, LOL. You can put the system on rails, but it's an engineering problem: these things are noisy program emitters with some P(correct | context). Model them as noisy channels and you can apply the same error-correcting codes to build channels with arbitrarily low noise. Rough sketches of both ideas below.
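
A tripwire along those lines might look something like this. It's a hypothetical sketch, not my actual code: the marker list, the assert_real_data helper, and the "data_manifest.json" file are all made up for illustration.

    import hashlib
    import json
    import sys
    from pathlib import Path

    # Hypothetical markers; tune to whatever your agent tends to generate.
    SYNTHETIC_MARKERS = ("mock", "synthetic", "fixture", "fake")

    def assert_real_data(path: str) -> Path:
        """Tripwire: refuse to benchmark anything that looks like
        mock/synthetic data, or that doesn't match a pinned checksum."""
        p = Path(path)
        if any(marker in p.name.lower() for marker in SYNTHETIC_MARKERS):
            sys.exit(f"TRIPWIRE: refusing to benchmark suspect dataset {p}")
        # "data_manifest.json" is an assumed file mapping dataset names
        # to known-good SHA-256 digests.
        manifest = json.loads(Path("data_manifest.json").read_text())
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        if manifest.get(p.name) != digest:
            sys.exit(f"TRIPWIRE: {p} does not match its pinned checksum")
        return p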
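And a minimal sketch of the noisy-channel framing, under the simplifying assumption that each agent run is an independent sample that is correct with probability p: the simplest error-correcting code is a repetition code, i.e. sample n times and take a majority vote, and for p > 0.5 the majority's error rate falls off exponentially in n (Chernoff bound). run_agent is a made-up placeholder, not any real API, and the probabilities are invented for the demo.

    import random
    from collections import Counter

    def run_agent(task: str) -> str:
        """Hypothetical stand-in for one independent agent run:
        correct with probability p, otherwise one of several wrong outputs."""
        p = 0.7
        if random.random() < p:
            return "correct"
        return f"wrong-{random.randrange(3)}"

    def majority_vote(task: str, n: int = 5) -> str:
        """Repetition code: take n independent samples and keep the
        most common answer."""
        samples = [run_agent(task) for _ in range(n)]
        answer, _count = Counter(samples).most_common(1)[0]
        return answer

    trials = 10_000
    single = sum(run_agent("t") != "correct" for _ in range(trials)) / trials
    voted = sum(majority_vote("t") != "correct" for _ in range(trials)) / trials
    print(f"single-run error ~{single:.2f}, 5-sample majority error ~{voted:.2f}")

With p = 0.7, a single run fails ~30% of the time, while the 5-sample majority fails at most ~16% of the time (and in practice less here, since wrong answers split across different strings). In a real pipeline you'd vote on something verifiable, like whether generated code passes a test suite, rather than on raw strings.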



