
Shouldn't you be getting the LLM to also generate test cases to drive the code, and enforce coding standards on the LLM so it produces small, easily comprehensible software modules with high-quality inline documentation?

Is this something people are doing?
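
To make that concrete, the standards half doesn't even need the LLM's cooperation - it can be enforced mechanically in CI. A rough sketch (the 300-line cap and the src/ layout are made-up examples, not recommendations):

    # check_modules.py: fail the build if a module is oversized or undocumented.
    import ast
    import pathlib
    import sys

    MAX_LINES = 300  # arbitrary example threshold
    failures = []

    for path in pathlib.Path("src").rglob("*.py"):
        source = path.read_text()
        if len(source.splitlines()) > MAX_LINES:
            failures.append(f"{path}: over {MAX_LINES} lines, split it up")
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                if not ast.get_docstring(node):
                    failures.append(f"{path}:{node.lineno}: {node.name} lacks a docstring")

    if failures:
        print("\n".join(failures))
        sys.exit(1)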



The problem is similar to that of journalism vs social media hoaxes.

An LLM-assisted engineer writes code faster than a careful person can review it.

Eventually the careful engineers get run over by the sheer amount of work to check, and code starts passing reviews when it shouldn't.

It sounds obvious that careless work is faster than careful work, but there are psychological issues at play: management's expectation of AI as a speed multiplier, a personal interest in being perceived as someone who delivers fast, engineers' fear of being seen as a bottleneck for others…


> expectation by management of ai as a speed multiplier

In many cases, it's more than an expectation. Top management especially: these are the people who signed off on massive AI spending on the basis that it would improve productivity. Any evidence to the contrary is not just counter to their expectations - it's a giant flashing neon sign screaming "YOU FUCKED UP". So of course organizations run by those people are going to pretend that everything is fine, for as long as anything works at all.

And then the other side of this is the users. Who have already been conditioned to shrug at crappy software because we made that the norm, and because the tech market has so many market-dominant players or even outright monopolies in various niches that users often don't have a meaningful choice. Which is a perfect setup for slowly boiling the frog - even if AI is used to produce sloppy code, the frog is already used to hot water, and already convinced that there's no way out of the pot in any case, so if it gets hotter still they just rant about it but keep buying the product.

Which is to say, it is a shitshow, but it's a shitshow that can continue for longer than most engineers have the emotional capacity to sustain without breaking down. In the long term, I expect AI coding in this environment to act as a filter: it will push the people who care about quality and polish out of the industry, and reward those who treat clicking "approved" on AI slop as their real job description.


I have no issue getting LLMs to generate documentation, modular designs or test cases. Test cases require some care; just like humans, LLMs are prone to making the same mistake in both the code and the tests, and they're particularly bad at telling whether it's the test or the code that's wrong. But those are solvable problems.

The thing I struggle with more when I use LLMs to generate entire features with limited guidance (so far only in hobby projects) is the LLM duplicating functionality or not sticking to existing abstractions. For example, if in existing code A calls B to get some data, and you now need to do some additional work on that data (e.g. enriching or verifying it), that change could be made in A, made in B, or you could add a new B2 that is just like B but with that slight tweak. Each of those can be appropriate, and LLMs sometimes make hilariously bad calls here.
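
A minimal sketch of that dilemma, with all names invented for illustration:

    # Existing code: A calls B. New requirement: verify the record.
    DB = {1: {"name": "Ada", "email": "ada@example.com"}}

    def fetch_user(user_id):              # "B": the existing accessor
        return DB[user_id]

    def handle_request(user_id):          # "A": the existing caller
        return fetch_user(user_id)["name"]

    # Placement 1 - in A: only this caller pays the verification cost.
    def handle_request_v1(user_id):
        user = fetch_user(user_id)
        assert "email" in user
        return user["name"]

    # Placement 2 - in B: every caller pays, whether it needs to or not.
    def fetch_user_v2(user_id):
        user = DB[user_id]
        assert "email" in user
        return user

    # Placement 3 - a new B2: keeps B untouched, but now two near-duplicate
    # accessors can drift apart over time.
    def fetch_verified_user(user_id):
        user = fetch_user(user_id)
        assert "email" in user
        return user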


The LLM will generate test cases that don't test anything, or falsely report tests as passing, which means you need to deeply review and understand the tests as well as the code they're testing. Which goes back to the point in the article, again.
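
The classic failure shape, as an invented example: a test that restates the implementation rather than the requirement passes no matter what the code does.

    # Buggy code under test: the total ignores quantity.
    def total_price(items):
        return sum(i["price"] for i in items)

    def test_total_price_vacuous():
        items = [{"price": 5, "qty": 3}]
        # Mirrors the implementation, so it shares the bug and always passes.
        assert total_price(items) == sum(i["price"] for i in items)

    def test_total_price_real():
        items = [{"price": 5, "qty": 3}]
        # Asserts the requirement; this one fails and exposes the bug.
        assert total_price(items) == 15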


> which means you need to deeply review and understand the tests as well as the code it's testing

Yes...? Why wouldn't you always do this, LLM or not?


I came back to this. NOBODY mentioned spec-kit!

https://github.com/github/spec-kit

""" Spec-Driven Development flips the script on traditional software development. For decades, code has been king — specifications were just scaffolding we built and discarded once the "real work" of coding began. Spec-Driven Development changes this: specifications become executable, directly generating working implementations rather than just guiding them. """

The takeaway is that instead of vibe-coding, you write specs and get the LLM to align the generated code with them.
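
As a generic illustration of the loop (this is not spec-kit's actual file format or commands), the spec's acceptance criteria become the contract the generated code is held to:

    # Spec (informal): usernames are 3-20 chars, lowercase letters,
    # digits, and underscores only; validation returns a boolean.
    import re

    def is_valid_username(name: str) -> bool:
        """Generated implementation, aligned to the spec above."""
        return re.fullmatch(r"[a-z0-9_]{3,20}", name) is not None

    # Acceptance checks derived from the spec, not from the code:
    assert is_valid_username("ada_99")
    assert not is_valid_username("ab")     # too short
    assert not is_valid_username("Ada")    # uppercase not allowed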


I just wrote a reply elsewhere, but we got a new vibe-coded (marketing) website. How is an LLM going to write test cases for that? And what good will they do? I assume it will also change the test cases when you ask it to rewrite things.


> How is an LLM going to write test cases for that?

"Please generate unit tests for the website that exercise documented functionality" into the LLM used to generate the website should do it.

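In practice you get something like this back (hypothetical URL and page copy; whether these assertions catch anything real is exactly the open question):

    # Smoke tests an LLM might emit for a marketing site.
    import requests

    BASE = "https://example.com"  # placeholder, not a real deployment

    def test_homepage_loads():
        assert requests.get(BASE, timeout=10).status_code == 200

    def test_contact_link_present():
        # Brittle: breaks on any copy rewrite, yet passes on a broken layout.
        assert "Contact" in requests.get(BASE, timeout=10).text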

The people who are doing that aren't writing these blog posts. They're writing much better code, faster, while quietly panicking a bit internally about the future.


You're probably talking about the infamous Dark Matter Developers [1]. When the term was coined, I thought there were many of them; now, seeing how many developers are here on HN (myself included), I doubt there are many left /s.

The quote that is interesting in the context of fast-paced LLM development is this:

> The Dark Matter Developer will never read this blog post because they are getting work done using tech from ten years ago and that's totally OK

[1] https://www.hanselman.com/blog/dark-matter-developers-the-un...



