That's where we come in of course! Look into spec-driven development. You basically encourage the LLM to ask questions and hash out all these details.
> Who said anything about the browser?... Don't defend these bugs.
A problem of insufficient specification... didn't expect an HN comment to turn into an engineering exercise! :-) But these are the kinds of things you'd put in the spec.
> How long does it take to debug code like this when an AI generates it? It took me about 25 minutes to discover & verify those problems.
Here's where it gets interesting: before reviewing any code, I basically ask it to generate tests, which always all pass. Then I review the main code and test code, at which point I usually add even more test-cases (e.g. https://news.ycombinator.com/item?id=46143454). And, because codegen is so cheap, I can even include performance tests, (which statistically speaking, nobody ever does)!
Here's a one-shot result of that approach (I really don't mean to take up more of your time, this is just so you can see what it is capable of):
https://pastebin.com/VFbW7AKi
While I do review the code (a habit -- I always review my own code before a PR), I review the tests more closely because, while boring, I find them a) much easier to review, and b) more confidence-inspiring than manual review of intricate logic.
> I want chatgpt to solve problems its probably never seen before, with all the context of the problem I'm trying to solve, to fit in with all the code we've already written.
Totally, and again, this is where we come in! Still, it is a huge productivity booster even in uncommon contexts. E.g. I'm using it to do computer vision stuff (where I have no prior background!) with opencv.js for a problem not well-represented in the literature. It still does amazingly well... with the right context. It's initial suggestions were overindexed on the common case, but over many conversations, it "understood" my use-case and consistently gives appropriate suggestions. And because it's vision stuff, I can instantly verify results by sight.
Big caveat: success depends heavily on the use-case. I have had more mixed results in other cases, such as concurrency issues in an LD_PRELOAD library in C. One reason for the mixed sentiments we see.
> ChatGPT is increasingly good at doing the work of a junior developer.
Yes, and in fact, I've been rather pessimistic about the prospects of junior developers, a personal concern given I have a kid who wants to get into software engineering...
I'll end with a note that my workflow today is extremely different from before AI, and it took me many months of experimentation to figure out what worked for me. Most engineers simply haven't had the time to do so, which is another reason we see so many mixed sentiments. But I would strongly encourage everybody to invest the time and effort because this discipline is going to change drastically really quickly.
That's where we come in of course! Look into spec-driven development. You basically encourage the LLM to ask questions and hash out all these details.
> Who said anything about the browser?... Don't defend these bugs.
A problem of insufficient specification... didn't expect an HN comment to turn into an engineering exercise! :-) But these are the kinds of things you'd put in the spec.
> How long does it take to debug code like this when an AI generates it? It took me about 25 minutes to discover & verify those problems.
Here's where it gets interesting: before reviewing any code, I basically ask it to generate tests, which always all pass. Then I review the main code and test code, at which point I usually add even more test-cases (e.g. https://news.ycombinator.com/item?id=46143454). And, because codegen is so cheap, I can even include performance tests, (which statistically speaking, nobody ever does)!
Here's a one-shot result of that approach (I really don't mean to take up more of your time, this is just so you can see what it is capable of): https://pastebin.com/VFbW7AKi
While I do review the code (a habit -- I always review my own code before a PR), I review the tests more closely because, while boring, I find them a) much easier to review, and b) more confidence-inspiring than manual review of intricate logic.
> I want chatgpt to solve problems its probably never seen before, with all the context of the problem I'm trying to solve, to fit in with all the code we've already written.
Totally, and again, this is where we come in! Still, it is a huge productivity booster even in uncommon contexts. E.g. I'm using it to do computer vision stuff (where I have no prior background!) with opencv.js for a problem not well-represented in the literature. It still does amazingly well... with the right context. It's initial suggestions were overindexed on the common case, but over many conversations, it "understood" my use-case and consistently gives appropriate suggestions. And because it's vision stuff, I can instantly verify results by sight.
Big caveat: success depends heavily on the use-case. I have had more mixed results in other cases, such as concurrency issues in an LD_PRELOAD library in C. One reason for the mixed sentiments we see.
> ChatGPT is increasingly good at doing the work of a junior developer.
Yes, and in fact, I've been rather pessimistic about the prospects of junior developers, a personal concern given I have a kid who wants to get into software engineering...
I'll end with a note that my workflow today is extremely different from before AI, and it took me many months of experimentation to figure out what worked for me. Most engineers simply haven't had the time to do so, which is another reason we see so many mixed sentiments. But I would strongly encourage everybody to invest the time and effort because this discipline is going to change drastically really quickly.