I have pretty much the same amount of confidence when I receive AI generated or non-AI generated code to review: my confidence is based on the person guiding the LLM, and their ability to that.
Much more so than before, I'll comfortably reject a PR that is hard to follow, for any reason, including size. IMHO, the biggest change that LLMs have brought to the table is that clean code and refactoring are no longer expensive, and should no longer be bargained for, neglected or given the lip service that they have received throughout most of my career. Test suites and documentation, too.
(Given the nature of working with LLMs, I also suspect that clean, idiomatic code is more important than ever, since LLMs have presumably been trained on that, but this is just a personal superstition, that is probably increasingly false, but also feels harmless)
The only time I think it is appropriate to land a large amount of code at once is if it is a single act of entirely brain dead refactoring, doing nothing new, such as renaming a single variable across an entire codebase, or moving/breaking/consolidating a single module or file. And there better be tests. Otherwise, get an LLM to break things up and make things easier for me to understand, for crying out loud: there are precious few reasons left not to make reviewing PRs as easy as possible.
So, I posit that the emotional reaction from certain audiences is still the largest, most exhausting difference.
The code I've seen generated by others has been pretty terrible in aggregate, particularly over time as the lack of understanding and coherent thought starts to show. Quite happy without it thanks, haven't seen it adding value yet.
Or is the bad code you've seen generated by others pretty terrible, but the good code you've seen generated by others blends in as human-written?
My last major PR included a bunch of tests written completely by AI with some minor tweaking by hand, and my MR was praised with, "love this approach to testing."
I think maybe there's another step too - breaking the design up into small enough peices that the LLM can follow it, and you can understand the output.
So do all the hard work yourself and let the AI do some of the typing, that you’ll have to spend extra time reviewing closely in case its RNG factor made it change an important detail. And with all the extra up front design, planning, instructions, and context you need to provide to the LLM I’m not sure I’m saving on typing. A lot of people recommend going meta and having LLMs generate a good prompt and sequence of steps to follow, but I’ve only seen that kinda sorta work for the most trivial tasks.
Unless you're doing something fabulously unique (at which point I'm jealous you get to work on such a thing), they're pretty good at cribbing the design of things if it's something that's been well documented online (canonically, a CRUD SaaS app, with minor UI modification to support your chosen niche).
And if you are doing something fabulously unique, the LLM can still write all the code around it, likely help with many of the components, give you at least a first pass at tests, and enable rapid, meaningful refactors after each feature PR.
I don't really understand your point. It reads like you're saying "I like good code, it doesn't matter if it comes from a person or an LLM. If a person is good at using an LLM, it's fine." Sure, but the problem people have with LLMs is their _propensity_ to create slop in comparison to humans. Dismissing other people's observations as purely an emotional reaction just makes it seem like you haven't carefully thought about other people's experiences.
My point is that, if I can do it right, others can too. If someone's LLM is outputing slop, they are obviously doing something different: I'm using the same LLMs.
All the LLM hate here isn't observation, it's sour grapes. Complaining about slop and poor code quality outputs is confessing that you haven't taken the time to understand what is reasonable to ask for, aren't educating your junior engineers how to interact with LLMs.
Much more so than before, I'll comfortably reject a PR that is hard to follow, for any reason, including size. IMHO, the biggest change that LLMs have brought to the table is that clean code and refactoring are no longer expensive, and should no longer be bargained for, neglected or given the lip service that they have received throughout most of my career. Test suites and documentation, too.
(Given the nature of working with LLMs, I also suspect that clean, idiomatic code is more important than ever, since LLMs have presumably been trained on that, but this is just a personal superstition, that is probably increasingly false, but also feels harmless)
The only time I think it is appropriate to land a large amount of code at once is if it is a single act of entirely brain dead refactoring, doing nothing new, such as renaming a single variable across an entire codebase, or moving/breaking/consolidating a single module or file. And there better be tests. Otherwise, get an LLM to break things up and make things easier for me to understand, for crying out loud: there are precious few reasons left not to make reviewing PRs as easy as possible.
So, I posit that the emotional reaction from certain audiences is still the largest, most exhausting difference.