The political circus is drowning out some pretty clear science here. Let me break this down without the academic jargon:

The basic problem: Most studies can't tell the difference between the medicine and why you're taking it. If you're taking Tylenol during pregnancy, it's probably because you have a fever, infection, or severe pain. Guess what also increases autism risk? Fever, infections, and severe illness.

What makes the Swedish study special: They compared siblings in the same family. Same genes, same environment, same parents - but one child was exposed to acetaminophen in the womb and the other wasn't. This controls for all the family-level stuff that usually confuses these studies.

The numbers tell the story:

- Regular studies: "5% increased autism risk with acetaminophen" (HR 1.05)

- Swedish sibling comparison: "Actually, no increased risk" (HR 0.98, could be 7% protective to 4% harmful - basically noise)

- Meanwhile, untreated fever: 40% increased risk; multiple fevers: 212% increased risk

We have evidence that fever during pregnancy messes with fetal brain development. We have the best study ever done showing acetaminophen doesn't cause autism. So we're going to... stop treating the fever?

It's like refusing to use a fire extinguisher because you're worried it might stain your carpet, while your house burns down.

The Swedish study should have ended this debate. When the science is done correctly, the acetaminophen "risk" vanishes completely.

Sources:

- Swedish study: https://jamanetwork.com/journals/jama/fullarticle/2817406

- Fever-autism evidence: https://molecularautism.biomedcentral.com/articles/10.1186/s...


> The Swedish study should have ended this debate.

I agree with everything you’ve said except this statement.

I’m of the opinion that a single study should never end debate. It may inform policy, sure, but not end debate. Certainly not unless and until it has been replicated by others.


Fair point on the "ended debate" phrasing - that was imprecise on my part. What I should have said is "the Swedish study provides the strongest evidence to date and shifts the burden of proof." It's not actually a single study though. The pattern is consistent across study quality levels:

Population studies (many): Small associations, but can't control for confounding

Negative control studies (several): Associations weaken when using better controls

Sibling studies (multiple, including Swedish): Associations disappear entirely

Meanwhile, fever studies (dozens): Consistent risk signals across different populations

The Swedish study is just the largest and best-designed in a hierarchy of evidence that all points the same direction. When you see this "dose-response by study quality" pattern - where better methodology consistently yields weaker effects - it's usually a strong signal that the original association was artifactual.

The Economist piece published yesterday reinforces this. They mention the NIH study of 200,000 children that "found no link at all" - that's another high-quality study reaching the same conclusion. Meanwhile, the studies showing associations (Nurses' Health Study II, Boston Birth Cohort) are exactly the type of population studies that can't control for the fever/infection confounding.

Science is never "settled" in an absolute sense, but the weight of evidence here is pretty clear. We're not waiting for more acetaminophen studies - we're ignoring the ones we already have while making policy based on weaker evidence.

That's the real problem with the current policy shift.


> Fair point on the "ended debate" phrasing - that was imprecise on my part.

Oh, no worries. I was fairly certain I understood what you meant. Honestly that part of my comment was intended for others reading it, as it certainly seems that many people do believe a single peer-reviewed study should end the debate.

> the Swedish study provides the strongest evidence to date and shifts the burden of proof

100% agree :)

> It's not actually a single study though.

Unless I'm missing something, it is. It looks at a single population (Swedish children born between 1995 and 2019) that is divided into multiple cohorts. This approach strikes me as entirely valid -- but it also weakens the signal it provides. With a population of this size and number of recorded attributes, there are likely cohorts that could be found to support any hypothesis the author would like. There are almost certainly many that would meet the bar of statistical significance if you're willing to form the hypothesis based on the data.

In other words, my initial impression is that it's potentially a variant of "P-hacking", regardless of intent. Unless the hypothesis was formed a priori, recorded, and not modified, the results are evidence that a pattern may exist, but not proof that it does.

> The Swedish study is just the largest and best-designed in a hierarchy of evidence that all points the same direction

From my perspective -- and to be clear, that's very much a lay perspective! -- I agree, and that direction is "there is likely a correlation between the use of acetaminophen during pregnancy and childhood autism diagnosis".

... but at the risk of being tiresome, correlation is not causation. My (unproven!) hypothesis at this point is that both higher rates of autism and acetaminophen use are a result of persistent fevers, which are themselves likely a result of chronic systemic inflammation.

If that is in fact the case, then it would simultaneously be true that acetaminophen use would be a strong leading indicator of autism and that ceasing the use of acetaminophen during pregnancy would actually _increase_ the rate of autism overall.


This mirrors exactly what we learned from outsourcing over the past two decades. The successful teams weren’t those with the best offshore developers - they were the ones who mastered writing unambiguous specifications.

AI coding has the same bottleneck: specification quality. The difference is that with outsourcing, poor specs meant waiting weeks for the wrong thing. With AI, poor specs mean iterating indefinitely on the wrong thing.

The irony is that AI is excellent at helping refine specifications - identifying ambiguities, expanding requirements, removing assumptions. The specification effectively IS the code, just in human language instead of syntax.

Teams that struggled with distributed development are repeating the same mistakes with AI. Those who learned specification discipline are thriving because they understand that clear requirements determine quality output, regardless of the implementer.


Makes me wonder whether leadership will bounce back from vibe coding faster than it did from outsourcing.

I wasn't around then, but colleagues told me it took years for leadership to understand what was happening and to turn the ship around.


And the ship only stays turned around for a brief period of time, because the next generation of MBAs will restart the outsourcing cycle. The allure of replacing your most expensive employees at one third the cost, regardless of quality impacts, is just too tempting to pass up.


While studying well-designed codebases is incredibly valuable, there's an important "tip of the iceberg" effect to consider: much of good software design lives in the "negative space" - what's deliberately not there.

The decisions to exclude complexity, avoid premature abstractions, or reject certain patterns are often just as valuable as the code you can see. But when you're studying a codebase, you're essentially seeing the final edit without the editor's notes - all the architectural reasoning that shaped those choices is invisible.

This is why I've started maintaining Architectural Decision Records (ADRs) in my projects. These document the "why" behind significant technical choices, including the alternatives we considered and rejected. They're like technical blog posts explaining the complex decisions that led to the clean, simple code you see.

ADRs serve as pointers not just for future human maintainers, but also for AI tools when you're using them to help with coding. They provide readable context about architectural constraints and compromises - "we've agreed not to do X because of Y, so please adhere to Z instead." This makes AI assistance much more effective at respecting your design decisions rather than suggesting patterns you've deliberately avoided.
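To make that concrete, here's a minimal sketch of the kind of ADR I mean (the numbering, the project details, and the Postgres-vs-Redis choice are invented for illustration, not taken from a real codebase):

    ADR 0007: Use Postgres LISTEN/NOTIFY instead of adding Redis

    Status: Accepted (2024-03-12)

    Context: We need lightweight pub/sub for cache invalidation. Adding Redis
    would introduce a second stateful service to operate and monitor.

    Decision: Use Postgres LISTEN/NOTIFY, which we already run, and accept its
    lower throughput ceiling.

    Alternatives considered and rejected:
    - Redis pub/sub: rejected to avoid a new operational dependency.
    - Polling a table: rejected due to added latency and database load.

    Consequences: If invalidation volume ever outgrows LISTEN/NOTIFY, revisit
    this decision before reaching for Redis.

Even a dozen lines like this captures the "negative space" - what we deliberately chose not to build, and why - which is exactly the context both future maintainers and AI assistants are otherwise missing.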

When studying codebases for design patterns, I'd recommend looking for projects that also maintain ADRs, design docs, or similar decision artifacts. The combination of clean code plus the architectural reasoning behind it - especially the restraint decisions - provides a much richer learning experience.

Some projects with good documentation of their design decisions include Rust's RFCs, Python's PEPs, or any project following the ADR pattern. Often the reasoning about what not to build is more instructive than the implementation itself.


Oooh I like that idea. I may steal it. I'll be on the lookout for documents like that. It'll be interesting to see what patterns/designs were avoided. There are so many ways to accomplish the same thing it might be nice to set some limits.


https://daringfireball.net/thetalkshow/2025/03/23/ep-419

They talked through the options back in March.


The irony is quite striking: just as ChatGPT can generate confident-sounding but inaccurate information, Altman appears to be presenting unsubstantiated claims about his company’s environmental impact. Both involve presenting information without reliable backing, though the consequences differ - one misleads users in conversations, the other potentially misleads stakeholders and the public about environmental responsibility.


This echoes my experience with Claude Code. The bottleneck isn't the code generation itself—it's two critical judgment tasks:

1. Problem decomposition: Taking a vague idea and breaking it down into well-defined, context-bounded issues that I can effectively communicate to the AI

2. Code review: Carefully evaluating the generated code to ensure it meets quality standards and integrates properly

Both of these require deep understanding of the domain, the codebase, and good software engineering principles. Ironically, while I can use AI to help with these tasks too, they remain fundamentally human judgment problems that sit squarely on the critical path to quality software.

The technical skill of writing code has been largely commoditized, but the judgment to know what to build and how to validate it remains as important as ever.


This would be at least the third time in history we've tried to shunt writing code to low-paid labor. We'll see if it's successful this time.

The problem tends to be that small details affect large details, which affect small details. If you aren't good at both, you're usually shit at both.


The problem wasn't low-paid labor, it was just incompetent labor. You can find competent developers in all of the countries offering lower pay - India, Brazil, Romania, Poland, China, Pakistan - it's just that they would already be hired by other, higher-paying companies, and what's left for the ones looking for the lowest-paid workers possible are the incompetent ones.


> it's just that they would already be hired by other, higher-paying companies, and what's left for the ones looking for the lowest-paid workers possible are the incompetent ones

Reminds me of working in IT. One company tried to outsource my job to India five different times before they were mostly successful at it. The companies that are successful aren't the ones that assume it'll cost 1/10th the price; they're the ones that know it'll cost 60+% of the price and still require some handholding.

If you're hiring on price alone, you're already selecting the pool that doesn't contain the most competent labor.


"Never buy the cheapest version of something." I don't remember who told me that, but it was good advice. There's always a reason.


IMO attempts to make it low-paid work will fail, just as they have for almost every STEM profession. But... the number of engineers we need who operate as "power multipliers" on a team will continue to decrease. Many startup and corporate teams already don't need junior/mid-level engineers any longer.

They just need "drivers": senior/lead/staff engineers who can run independent tracks. AI becomes the "power multiplier" on these teams, amplifying the effect of the "driver".

Many people pretend that 10x engineers don't exist. But anyone who has worked on an adequately high performing team at a large (or small) company knows that skill, and quite frankly intelligence, operate on power laws.

The bottom 3 quartiles will be virtually unemployable. Talent in the top quartile will be impossible to find because they're all employed. Not all that unlike today, though which quartile you fall into is largely going to depend on how "great" of an engineer you are AND how effectively you use AI.

As this happens, the tap of new engineers learning how to make it into the top quartile will cut off for everyone except those who are passionate/masochistic enough to learn to program without AI, then learn to program WITH AI.

Meanwhile the number of startups disrupting corporate monopolies will increase as the cost of labor goes down due to lower headcount requirements. Lower headcounts will lead to better team communication and better business efficiency in general.

At some point the upper quartile will get automated too. And with that, corporate moats evaporate to solo-entrepreneurs and startups. The ship is sinking, but the ocean is about to boil too. When economic formulas start dividing by zero, we can be pretty sure that we can't predict the impact.


Solopreneurs won't be able to effectively lobby govt regulators


I have significantly more faith in schoolchildren regulating an insane asylum


Someone told me AI was like having a bunch of junior coders. You have to be very explicit in telling it what to do and have to go through several iterations to get it right. Though it was cheaper.


That matches my experience.

Decomposing a problem so that it can be solved with ease is what I enjoy most about programming. I am fine with no longer having to write as much code myself, but I resent having to review so much more.

Now, how do we solve the problem of people blindly accepting what an LLM spat out based on a bad prompt? This applies universally [0] and is not a technological problem.

0 - https://www.theverge.com/policy/677373/lawyers-chatgpt-hallu...


Agreed on the review burden being frustrating. Two strategies I've found helpful for managing the cognitive load:

1. Tight issue scoping: Making sure each issue is narrowly defined so the resulting PRs are small and focused. Easier to reason about a 50-line change than a 500-line one.

2. Parallel PR workflow: Using git worktrees to have multiple small PRs open simultaneously against the same repo. This lets me break work into digestible chunks while maintaining momentum across different features.

The key insight is that smaller, well-bounded changes are exponentially easier to review thoroughly. When each PR has a single, clear purpose, it's much easier to catch issues and verify correctness.

I'm finding these workflow practices help because they force me to engage meaningfully with each small piece rather than rubber-stamping large, complex changes.


> The key insight is that smaller, well-bounded changes are exponentially easier to review thoroughly.

I am not sure if that is the real insight. It appears to me that most people prefer small, well-bounded changes, but it's quite tricky to break down large tasks into small but meaningful changes, isn't it? To me, that appears to be the key.


Exactly - and that's where AI becomes really valuable as a thinking partner. I use Claude Code to have conversations with my codebase about how to slice problems down further.

The issue definition itself becomes something you can iterate on and refactor, just like code. Getting that definition tightly bounded is more critical than ever because without clear boundaries, the AI doesn't know when to stop or what constitutes "done."

It's like having a pair programming session focused purely on problem decomposition before any code gets written. The AI can help you explore different ways to break down the work, identify dependencies, and find natural seams in the problem space.


So really, the same two skills that a senior engineer needs to delegate tasks to juniors and review the results.


Nope, dealing with juniors is way less frustrating because they learn. So over time, you can increase the complexity of their tasks until they're no longer junior.


Agreed on that point, and my question for a lot of the AI bros has been "what would you actually do with unlimited interns who never improve much?"

For me, not much! Others may differ.

In my own experience, interns are a net drag. New college hires flip positive after 3-6 months... if they are really good. Many take upwards of a year.


> In my own experience, interns are a net drag. New college hires flip positive after 3-6 months... if they are really good. Many take upwards of a year.

And mostly the problem with their output is not really incorrect code, but incorrect approaches. By reviewing their code, you find gaps in their knowledge, which you can then correct. They're here to learn, not to produce huge amounts of code. The tasks are more for practice and evaluation than things you critically need.

I don't want to work with a junior, but I'm more than happy to guide them to be someone I can work with.


I do agree that "unlimited interns who don't improve much" is less practically useful than it might seem at first, but OTOH "never improve much" seems unrealistic, given the insane progress of the field in the last 3ish years (or think back 5 years and tell me who was realistically predicting tools like Claude Code to even exist by 2025).

Also, there's a decently large subset of small startups where there's 1 technical founder and a team of contract labor, trying to build that first MVP or cranking out early features in a huge rush to stay alive, where yeah, cheap unlimited interns might actually be meaningfully useful or economically more attractive than whatever they're doing now. Founders kind of have a perverse incentive, where a CTO doesn't need to solo code the first MVP, and also doesn't need to share/hand-out equity or make early hires quittteee as early, if unlimited interns can scale that CTO's solo productivity for a bit longer than the before-times.


> but OTOH "never improve much" seems unrealistic, given the insane progress of the field in the last 3ish years

The point is that no one should hire an intern or a junior because they think it will improve their team's productivity. You hire interns and juniors because there's a causal link between "I hired an intern and spent money training them" and "they joined my company full time and a year later are now productive, contributing members of the team". It's an investment in the future, not a productivity boost today.

There is no causal link between "I aggressively adopted Claude Code in 2025" and "Claude Code in 2026 functions as a full software engineer without babysitting". If I sit around and wait a year without adopting Claude Code that will have no measurable impact on Claude Code's 2026 performance, so why would I adopt it now if it's still at intern- or junior-level skill?

If we accept that Claude is a junior-level contribution then the rational move is to wait and watch for now and only adopt it in earnest if and when it uplevels.


Precisely - AI getting better or not has nothing to do with my burning cycles using it. My juniors do improve based on my effort. I can free ride on AI getting good enough later (wait) whereas I cannot with my own team of juniors.

> 1 technical founder and a team of contract labor, trying to build that first MVP or cranking out early features in a huge rush

Having worked in environments with a large number of junior contractors... this is generally a recipe for a lot of effort with resulting output that neither works technically nor actually delivers features.


To your last point -- I didn't say large number of junior contractors would write good code or whatever. The change that is happening in the startup scene now, as compared to say 10 years ago, is more about lowering the barrier to MVP and making it easier/cheaper for startups to experiment with finding product market fit, than anything to do with "productivity" or code quality or whatever.

We're probably just talking past each other, because the thing you care about is not the thing I care about. I am saying that, it used to cost some reference benchmark of $X/idea to iterate as a startup and experiment with ideas, but then it became 0.5X because gig workers or overseas contractors became more accessible and easier to work with, and now it's becoming 0.1X because of LLMs and coding agents. I am not making any sort of argument about quality being better/good/equal, nor am I making any sort of conversion chart between 10 interns or 100 LLM agents equals 1 senior engineer or something... Quality is rarely (never?) the deciding factor, when it comes to early pre-seed iteration as a startup tries to gasp and claw for something resembling traction. Cost to iterate as well as benefits of having more iterations, can be improving, even if each iteration's quality level is declining.

I'm simply saying, if I was a founder, and I had $10k to spend to test new ideas with, I can test a helluva lot more ideas today (leveraging AI), vs what I could have done 5 years ago (using contractors), vs what I could have done 10-20 years ago (hiring FTEs, just to test out ideas, is frankly kind of absurd when you think about how expensive that is). I am not saying that $10k worth of Claude Code is going to buy me a production grade super fantastic amazing robust scalable elegant architecture or whatever, but it sure as heck can buy me a good enough working prototype and help me secure a seed round. Reducing that cost of experimentation is the real revolution (and whether interns can learn or will pay off over time is a wholly orthogonal topic that has no bearing to this cost of experimentation revolution).


Yeah in this context I get what you are talking about. I got through your first paragraph and thought of the startup founders using overseas / gig workers a decade ago to test ideas.. which is exactly where you went!


"hired an intern and spent money training them" and "they joined my company full time and a year later are now productive"

Why would I do that if I can have somebody else pay for the training and then poach them when they are ready?


What are you doing to actually keep them at your company? I left a company after they invested a lot in training me. They already paid poorly and gave me very small raises, with no guaranteed bonus, bad vacation hours, and no opportunities for promotion. They were shocked when I left, even though I had asked for very modest raises and was way more productive than the "seniors" at the company.

Most companies outside of FAANGs treat their talented juniors like crap, so of course they'll leave.


Which is exactly why no one's hiring juniors anymore. It made sense back when the market for hiring engineers was super competitive and it was easier to gamble on being able to keep a junior than it was to try to snag a senior. But now that there are seniors galore on the market who would bother with a junior?


> Also, there's a decently large subset of small startups where there's 1 technical founder and a team of contract labor, trying to build that first MVP or cranking out early features in a huge rush to stay alive, where yeah, cheap unlimited interns might actually be meaningfully useful or economically more attractive than whatever they're doing now

That's when experienced developers are a huge plus. They know how to cut corners in a way that will not hurt that much in the long term. It's more often the intern-level folks who propose stuff like next.js, kubernetes, cloud-native... that will grind you to a halt once the first bugs appear.

A very small team of good engineers will get you much further than any army of intern level coders.


Yeah "actually good engineers" are like a 10:1 ratio with intern/new college hire/junior consultant level.

Not to generalize too much, but if you are contracting out to some agency for junior levels, you are generally paying a markup on coders who couldn't find better direct-hire jobs to start with. At least at the mid/senior level you can get into more of a hired-gun deal with someone who is between gigs or working part time, and buy a share of their time you couldn't afford full-time.

In fact, with most junior consultants you are basically paying for the privilege of training other people's employees, who will then be billed back to you at a higher rate when they improve... if they don't move on otherwise.


Disagree. Some learn, but not all, and decreasing numbers care to learn

Also, most juniors have no idea how to write tests, plan for data scale, or know which IPC-RPC combo is best for prototyping vs production

Etc…

90% of software is architecture and juniors don’t architect


> Disagree. Some learn, but not all, and decreasing numbers care to learn

This is an organizational issue then—someone who is operating at a junior level who demonstrates that they don’t care to learn should be let go.


We’re saying the same thing

The business threshold (willingness to pay for something) for the worst automation will eventually beat the marginal expert.

So there becomes no business differentiation between a junior and a middle engineer

“Architecture” becomes the entry-level job


But they are so cheap, and they increase the headcount on my fiefdom chart.


Yes, a combination of empire building and "but $X exceeds the $Y cap set by HR for Z role! / we can hire XX juniors for this price!" type of mega corp thinking.


Bingo! Not to mention that "dealing with juniors" is one of the critical ways for a senior engineer to grow.


This is exactly how to use it and exactly why it’s a huge deal

In my experience so far, the people that aren’t getting value out of LLM code assistants, fundamentally like the process of writing code and using the tooling

All of my senior, staff, and principal engineers love it, because we can make something faster than having to deal with a junior - it's trivial to write the spec/requirement for Claude etc…


> All of my senior, staff, and principal engineers love it, because we can make something faster than having to deal with a junior - it's trivial to write the spec/requirement for Claude etc…

How will you make new senior, staff, and principal engineers without "having to deal with a junior"?


You hire seniors from other companies ofc.


You don’t in the long term

It’s just like “calculator” used to be a manual human job in engineering

Los Alamos, NASA etc… literally had 100s of individual humans running long calculations that computers didn’t have the memory to handle

There are no more human computers


> In my experience so far, the people that aren’t getting value out of LLM code assistants, fundamentally like the process of writing code and using the tooling

> All of my senior, staff, and principal engineers love it, because we can make something faster than having to deal with a junior - it's trivial to write the spec/requirement for Claude etc…

Hm, interesting. As someone who has found zero joy and value in using LLMs, this rings true to me. Setting aside the numerous glaring errors I get every time I try to use one, even if the tools were perfect, I don't think I would enjoy using them. I enjoy programming, thinking about how to break down a problem and form abstractions and fit those into the tools the language I'm writing in gives me. I enjoy learning and using the suite of Unix tools like grep and sed and vim to think about how to efficiently write and transform code. The end product isn't the fun part, the fun part is making the end product. If software engineering just becomes explaining stuff in English to a machine and having some software pop out... then I think the industry just isn't for me anymore. I don't want to hand the fun part over to a machine.

It's like how I enjoy going to my wood shop to build tables instead of going to Ikea. It would be cheaper and faster and honestly maybe even better quality to go to Ikea, but the joy is in the knowledge and skill it takes to build the table from rough lumber.


"As someone who has found zero joy and value in using LLMs"

I love programming but find zero joy in front-end coding. For me, LLMs solved that bit nicely. I'm sure a real webdev would do better, but I can't afford one for my personal projects, and the LLM helped me get it done more than good enough for my needs.


Software at scale is a business function

You’re describing a hobby/artistry process

You can still do all that, the same way that you can still build a table at your house.

But the number of people building tables by hand is going to drop to effectively zero for the majority of table building going forward


I'm not sure I agree that will happen, but if it does, then yeah like I said, it's probably my cue to exit the industry. If the fun goes out of the job, there's other things I'd rather do than sit inside alone and stare at a screen all day.


What the heck, the code generation _is_ absolutely still a bottleneck.

I dare anyone making these arguments - that LLMs have removed the need for actual programming skill, for example - to join a virtual pair programming session with me, and I will demonstrate the models' basic inability to do _any_ moderately complex coding in short order. Yes, I think that's the only way to resolve this controversy. If they have some magic sauce for prompting, they should post a session or chat that can be verified by others (even if not exactly repeatable).

Yesterday almost my whole day was wasted because I chose to attack a problem primarily by using Claude 4 Sonnet. I had to hand-hold it every step of the way, continually correcting basic type and logic errors (even ones I had corrected previously in the same session), and in the end it just could not solve the challenge I gave it.

I have to be cynical and believe those shouting about LLMs taking over technical skill must have lots of stock in the AI companies.


Indeed.

All this “productivity” has not resulted in one meaningful open source PR or one interesting indie app launch, and I can’t square my own experience with the hype machine.

If it’s not all hat and no cattle, someone should be able to show me some cows.


>I can’t square my own experience with the hype machine.

Me neither. My gut feeling is it's the inexperienced who gain the most from generative AI. That does seem to be confirmed by papers like this:

https://mitsloan.mit.edu/ideas-made-to-matter/workers-less-e...

At most I've found it helps with some of the routine work but saving a few minutes typing doesn't offset the problems it creates.


I find it hard to believe the inexperienced would benefit at all. AI-assisted coding requires serious general experience in all matters of software to get good value out of it.


I find this hard to believe - how would you even know if someone used AI in producing a PR or indie product? Are you omniscient?

Further, there are articles here on HN all the time about people using AI for actual serious work. Here's a pretty significant example:

https://sean.heelan.io/2025/05/22/how-i-used-o3-to-find-cve-...


I dunno, as an engineer who likes to make side projects, I can say with high certainty that LLMs have helped me compensate for things I'm worse at when coding a product.

I'm good at the engineering side of things, I'm good at UI, I'm good at UX, I'm good at css, I'm just not good at design.

So I tell the LLM to do it for me. It works incredibly well.

I don't know if it's a net increase in productivity for me, but I am absolutely certain that it is a net increase in my ability to ship a more complete product.


That makes perfect sense to me. I’m finding real value in natural language search for code and docs, and “remind me how to do X.”

It’s the extraordinary claims of 10x speed and crazy autopilot that have me looking around for missing cows.


Why do that when they can ignore you and keep living in their bubble?


> Yesterday almost my whole day was wasted because I chose to attack a problem primarily by using Claude 4 Sonnet

I have been extremely cynical about LLMs up until Claude 4. For the specific project I've been using it on, it's done spectacularly well at specific asks - namely, performance and memory optimization in C code used as a Python library.


Honestly, it's mind-boggling. Am I the worst prompter ever?

I have three Python files (~4k LOC total) that I wanted to refactor with help from Claude 4 (Opus and Sonnet), and I followed Harper Reed's LLM workflow... the results are shockingly bad. It produces an okay plan, albeit full of errors, but usable with heavy editing. In the next step, though, most of the code it produced was pretty much unusable. It would've been far quicker for me to just do it myself. I've been trying to get LLMs to help me be faster on various tasks, but I'm just not seeing it! There is definitely value in it for helping straighten out ideas in my head and as a StackOverflow on roids, but that's where the utility starts to hit a wall for me.

Who are these people who are "blown away" by the results and declaring an end to programming as we know it? What are they making? Surely there ought to be more detailed demos of a technology that's purported to be this revolutionary!?

I'm going to write a blog post with what I started with, every prompt I wrote to get a task done, and the responses from the LLMs. It's been challenging to find a detailed writeup of implementing a realistic programming project; all I'm finding is small one-off scripts (Simon Willison's blog) and CRUD scaffolding so far.


I couldn't agree more. This has been my exact experience.

Like you, I'll probably write a blog post and show, prompt by prompt, just how shockingly bad Claude frequently is. And it's supposed to be one of the best at AI-assisted coding, which means the others are even worse.

That'll either convince people, match their experiences, or show me up to be the worst prompter ever.


I think you're supposed to let the AI write the bad Python code and then do the refactoring yourself. No way I'm letting the AI make changes to 150 files with tons of cross-cutting concerns when I don't even fully understand it all myself unless I dig into the code.

That being said, Copilot and ChatGPT have been at least a 40% productivity boost. I just write types that are as tightly fitting as possible, segregate code based on what side effects are going to happen, stub a few function heads, and let the LLM fill in the gaps. I'm so much faster at coding than I was 2-3 years ago. It's like I'm designing the codebase more than writing it.
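To illustrate, here's a minimal sketch of what I mean - the invoice domain and every name in it are made up for the example; the point is just the tight types, the stubbed function heads, and keeping side effects in their own thin layer:

    from dataclasses import dataclass
    from datetime import date
    from decimal import Decimal

    # Tightly typed, side-effect-free core: these stubs are what I hand to the LLM.
    @dataclass(frozen=True)
    class LineItem:
        description: str
        quantity: int
        unit_price: Decimal

    @dataclass(frozen=True)
    class Invoice:
        number: str
        issued_on: date
        items: tuple[LineItem, ...]

    def invoice_total(invoice: Invoice) -> Decimal:
        """Sum of quantity * unit_price across all line items."""
        raise NotImplementedError  # left for the LLM to fill in

    def render_invoice(invoice: Invoice) -> str:
        """Plain-text rendering: one line per item plus a total line."""
        raise NotImplementedError  # left for the LLM to fill in

    def save_invoice(invoice: Invoice, path: str) -> None:
        # Side effects stay in a thin outer layer that I write and review myself.
        with open(path, "w", encoding="utf-8") as f:
            f.write(render_invoice(invoice))

With the types and boundaries pinned down like that, the LLM has very little room to wander, and reviewing what it fills in stays quick.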


I don’t think AI marks the end of software engineers, but it absolutely can grind out code for well specified, well scoped problem statements in quarter-minutes that would take a human an hour or so.

To me, this makes my exploration workflow vastly different. Instead of stopping at the first thing that isn’t obviously broken, I can now explore nearby “what if it was slightly different in this way?”

I think that gets to a better outcome faster in perhaps 10-25% of software engineering work. That’s huge and today is the least capable these AI assistants will ever be.

Even just the human/social/mind-meld aspects will be meaningful. If it can make a dev team of 7 capable of making the thing that used to take a dev team of 8, that's around 15% less human coordination needed overall to get the product out. (This might even turn out to be half the benefit of productivity enhancing tools.)


> Instead of stopping at the first thing that isn’t obviously broken, I can now explore nearby “what if it was slightly different in this way?”

What? Software engineering is about problem solving, not finding the first thing that works and calling it a day. More often than not, you have too many solutions, and the one that's implemented is the result of a list of decisions you've taken.

> If it can make a dev team of 7 capable of making the thing that used to take a dev team of 8, that's around 15% less human coordination needed overall to get the product out.

You should really read the mythical man month.


I credit The Mythical Man Month for my understanding of the incredible costs of the increased need for coordination, and of the sharply decreasing return on productivity from additional people.

I don't take credit for the value of being able to do with 7 what currently takes 8, but rather ascribe it to the ideas of Fred Brooks (and others).


> I have to be cynical and believe those shouting about LLMs taking over technical skill must have lots of stock in the AI companies.

I'm far from being a "vibe" LLM supporter/advocate (if anything I'm the opposite, despite using Copilot on a regular basis).

But, have you seen this? Seems to be the only example of someone actually putting their "proompts" where their mouth is, in a manner of speaking. https://news.ycombinator.com/item?id=44159166


It’s interesting that your point about wasting time makes a second point in your favor as well.

If you don’t have the knowledge that begets the skills to do this work, then you would never have known you were wasting your time, or at least how to stop wasting it.

LLM fanboys don’t want to hear this but you can’t successfully use these tools without also having the skills.



Last week I figured I might as well vibe code with free Gemini and steal its credit rather than research something as destined to be horrible as the Android Camera2 API, and I found out that, at least for me, this version of Gemini does better if I prompt it in a... casual language.

"ok now i want xyz for pqr using stu can you make code that do" rather than "I'm wondering if...", with lowercase I and zero softening languages. So as far as my experience goes, tiny details in prompting matter and said details can be unexpected ones.

I mean, please someone just downvote and tell me it's MY skill issue.


I totally just verbalize my inner monologue, swearing and everything. Sometimes I just type "weeeeeeeelllllll" and send it, to get more LLM output or to have it provide alternatives.

It might sound weird, but I try to make the LLM comfortable, because I find you get worse results when you point out mistake after mistake and it goes into apologetic mode. Also, being nice puts me in a better mood, and it makes my own programming better.

vibe coding as it were :p


I want to add something to this which is rarely discussed.

I personally value focus and flow extremely highly when I'm programming. Code assistance often breaks and prevents that in subtle ways, which is why I've been turning it off much more frequently.

In an ironic way, using assistance more regularly helped me notice little inefficiencies, distractions, bad habits, and potential improvements while programming:

I mean that in a very broad sense, including mindset, tooling, taking notes, operationalizing, code navigation, recognizing when to switch from thinking/design to programming/prototyping, code organization... There are many little things that I could improve, practice and streamline.

So I disagree with this statement at a fundamental level:

> The technical skill of writing code has been largely commoditized (...)

In some cases, I find setting yourself up to get into a flow, or just a high-focus state, and then writing the code yourself very effective, because there's a stronger connection with the program, and my inner mental model of how it works is more intricate.

To me there are two important things to learn at the moment: recognizing which type of approach I should be using when, and setting myself up to use each of them more effectively.


Just move up an abstraction level and put that flow into planning the features and decomposing them into well-defined tasks that can be assigned to agents. You could also write really polished example code to communicate the style and architectural patterns, and add full test coverage for it.

I do notice the same lack of flow when using an agent, since you have to wait for it to finish. But as others have suggested, if you set up a few worktrees and have a really good implementation plan, you can use that time to get another agent started or review the code of a separate run, and that might lend itself to a type of flow where you're keeping the whole design of the project in your head and rapidly iterating on it.


> Just move up an abstraction level and put that flow into planning the features and decomposing them into well defined tasks that can be assigned to agents

This doesn't work because you still have to read and verify all of the stuff your agents produce

So the new workflow is: move up an abstraction level to use an agent to produce code, then move down an abstraction level to review the code it produces.

This sounds like way more cognitive overhead and way harder (and therefore probably slower) to do than just writing the code by hand in a good flow


There's something fundamentally different between writing the program directly that you visualize in your head versus staying one level away and reviewing someone else's code. I'm really talking about the former.


"these require deep understanding of the domain, the codebase, and good software engineering principles" Most of this AI can figure out eventually, except maybe the domain. But essentially software engineering will look a lot like product management in a few years.


As a (very good, I would say) product manager once told me: the product vision and strategy depend very much on the ability to execute. The market doesn't stand still, and what you _can_ do defines very much what you _should_ do.

What I mean to say here is that not even product management is reduced to just "understand the domain" - so it kinda' feels that your entire prediction leans on overly-simplified assumptions.


pretty big logic leap you made there. I didn't say understanding the domain was the only requirement. But certainly not understanding it will cause you to fail.


That's a narrow view of the issue described in the blog post. You're coming at this from the perspective of a software engineer, which is understandable given the website we're posting on, but the post is really focusing on something higher level - the ability to decide whether the problems you're decomposing and the code you're reviewing are for something "good" or "worthwhile" in the first place. Claude could "decompose problems" and "review code" 10x better than it currently does, but if the thing it's making is useless, awkward, or otherwise bad (because of prompts given by people without the qualities in the blog post), it won't matter.


You still need to be able to code to recognize when it's done poorly, and to write the technical specification.


Your approach to programming may be insightful, but professional relationships require mutual trust. You're asking potential employers to invest in your process without having demonstrated its value first.

Even at $15K/year, clients expect predictable results, not just philosophical alignment. Consider starting with smaller deliverables that showcase your abilities while building trust incrementally.

The most successful unconventional developers find ways to translate their unique perspectives into tangible value that others can recognize and measure. Build trust first, then you'll earn the freedom to work in your preferred style.


I’d take a step back and get clarification on the scope of your task. See my comment the other day about how you can use Claude to help with that.

https://news.ycombinator.com/item?id=43163011


Example output from your question:

## Core Questions for Migration Scope Clarification

1. *What exactly needs to be preserved?*
   - Business outcomes only, or exact implementation details?
   - Current scheduling patterns or can they be optimized?

2. *What's the true scale?*
   - Number of workflows needing migration
   - Complexity spectrum of the JavaScript snippets
   - Frequency and criticality of each workflow

3. *What are the real constraints?*
   - Timeline requirements
   - Available expertise (JavaScript, Python, Airflow)
   - Downtime tolerance during transition

4. *What's the maintenance plan?*
   - Who will support the migrated workflows?
   - What documentation needs to be created?
   - How will knowledge transfer occur?

5. *What's the verification strategy?*
   - How will you validate correct migration?
   - What tests currently exist or need to be created?
   - What defines "successful" migration?

6. *What's unique to your environment?*
   - Custom integrations with other systems
   - Special CA Workflow features being utilized
   - Environmental dependencies

7. *What's the true purpose of this migration?*
   - Cost reduction, technical debt elimination, feature enhancement?
   - Part of larger modernization or standalone project?
   - Strategic importance versus tactical necessity

8. *What approaches have been eliminated and why?*
   - Complete Python rewrite
   - Containerized JavaScript execution
   - Hybrid approaches

9. *What would happen if this migration didn't occur?*
   - Business impact
   - Technical debt consequences
   - Opportunity costs

10. *Who are the true stakeholders?*
   - Who relies on these workflows?
   - Who can approve changes to functionality?
   - Who will determine "success"?

Answering these questions before diving into implementation details will save significant time and reduce the risk of misaligned expectations.


Clear specifications are essential because they create shared understanding through collaborative discussion, preventing misalignment and expensive rework.

With AI-assisted coding, well-defined requirements have become even more crucial as these tools follow instructions precisely but lack business context.

The investment in proper definition isn't wasteful "meta-work" but rather insurance against the much higher cost of rebuilding the wrong solution.


This doesn’t seem to have anything to do with the article.


What you've described perfectly captures the fundamental difference in how many autistic minds approach truth and knowledge compared to neurotypical thinking patterns. This isn't about intelligence but about different operating systems with distinct priorities.

For many of us on the spectrum, uncertainty about factual correctness creates genuine distress. We experience "epistemic anxiety" - that intense need to resolve contradictions and establish what's objectively true. The scientific method becomes a lifeline precisely because it offers a systematic approach to establishing reliable knowledge.

What you observed in your classmates wasn't necessarily indifference to truth but a different relationship with it. Neurotypical social cognition often prioritizes social harmony, identity maintenance, and emotional comfort over factual precision. Being "wrong" for many people triggers social rather than epistemic anxiety.

Some practical advice from my experience:

1. Recognize that for most people, beliefs serve multiple functions beyond accuracy - they signal group membership, maintain self-image, and provide emotional comfort.

2. When sharing information that contradicts someone's view, frame it as an addition rather than a correction: "I recently learned something interesting about this" rather than "Actually, you're wrong."

3. Accept that you cannot make others value epistemic accuracy as intensely as you do. This was one of my hardest lessons.

4. Find your intellectual community. There are others here on hacker news who share your commitment to truth-seeking - they're often in fields like science, philosophy, or engineering.

5. Your heightened concern for factual accuracy is a strength. Many world-changing innovations and discoveries came from minds that couldn't tolerate the cognitive dissonance of an incorrect model.

