
I'm going to shill my own writing here [1], but I think it addresses this post in a different way. Because we can now write code so much faster, everything downstream of that is simply not ready for it. Right now we might have to slow down, but in the medium and long term we need to figure out how to build systems that can keep up with this increased influx of code.

> The challenge is to develop new personal and organizational habits that respond to the affordances and opportunities of agentic engineering.

I don't think it's the habits that need to change, it's everything. From how accountability works, to how code needs to be structured, to how languages should work. If we want to keep shipping at this speed, no stone can be left unturned.

[1]: https://lucumr.pocoo.org/2026/2/13/the-final-bottleneck/


I don’t think we can expect all workers at all companies to just adopt a new way of working. That’s not how competition works.

If agentic AI is a good idea and it increases productivity, we should expect to see some startup blowing everyone out of the water. We should be seeing it now if it makes you, say, ten times more productive. A lot of startups have now had a year of agentic AI to help them beat their competitors.


We're already seeing eye-watering, blistering growth from the new hot applied-AI startups and labs.

Imo the wave of top-down 'AI mandates' from incumbent companies is a direct result of the competitive pressure, although it probably won't work as well as the execs think it will.

That being said, even Dario claims a 5-20% speedup from coding agents. 10x productivity only exists in microcosm prototypes, or for someone so unskilled that one-shotting a localhost web app really is a 10x for them.


"eye-watering, blistering growth from the new hot applied AI startups and labs"

Could you give us a few examples?


Claude Cowork was apparently built in less than two weeks using Claude Code, and appears to be getting significant usage already.

Only a personal anecdote, but the humans I know who have used it are all aware of how buggy it is. It feels like it was made in two weeks.

Which gets back to the outsourcing argument: it’s always been cheap to make buggy code. If we were able to solve this, outsourcing would have been ubiquitous. Maybe LLMs change the calculus here too?


That's certainly a good example of a tool developed quickly thanks to AI assistance.

But coding assistance tools must themselves be evaluated by what they produce. We won't see significant economic growth from using AI tools to recursively build other AI tools unless there are companies using these tools to make enough money to justify the whole stack.

I believe there are teams out there producing software that people are willing to pay for faster than they did before. But if we were on the verge of rapid economic growth, I would expect HN commenters to be able to rattle these off by the dozen.


claude code 1B+ arr

ant 10xing ARR, oai

harvey legora sierra decagon 11labs glean(ish) base10(infra) modal(infra) gamma mercor(ish) parloa cognition

regulated industries giving these companies 7/8-fig contracts less than 2 years from incorporation


AI has been a lifesaver for my low performing coworkers. They’re still heavily reliant on reviews, but their output is up. One of the lowest output guys I ever worked with is a massive LinkedIn LLM promoter.

Not sure how long it’ll last though. With the time I spend on reviews I could have done it myself, so if they don’t start learning…


> With the time I spend on reviews I could have done it myself, so if they don’t start learning…

Then? Your job is still to review their code. If they are your coworker, you cannot fire them.


Then just start rubber-stamping their code. Say you "vibe" read it.

OpenClaw went from first commit in late November to Super Bowl commercial (it's meant to be the tech behind that AI.com vaporware thing) in February.

(Whether you think OpenClaw is good software is kind of beside the point.)


OpenClaw is not going to be a thing in 6 months. The core idea might exist but that codebase is built on a house of cards and is being replicated in 10% of the code.

I don’t think anyone is arguing against code agents being good at prototypes, which is a great feat, but most SWE work is built on maintaining code over time.


It’s very much not beside the point. Productivity is measured in how much value you get out from the hours your workers put in.

But that only gets you to a philosophical argument about what "value" is. Many would argue that being able to get your thing into a Super Bowl commercial is extremely valuable. I definitely have never built anything that did.

It's very much imperfect, but the only consistently agreed upon and useful definition of "value" we have in the West is monetary value, and in that sense, we have at least a few major examples of AI generating value rapidly.


OK but that also means VR was a success, and web 3, and NFTs.

Well, yes, these were definitely a success for some. And I personally still believe that VR will be a success in the longer-term.

In any case, I agree with the grandparent post about the distinction between being successful and good.


One of the most interesting aspects will come when LLMs are cheap and small enough that apps can ship with a built-in one, so the app can adjust its code for each user based on input/usage patterns.

The clear intent is to stop allowing regular people to be able to compute...anything. Instead, you'll be given a screen that only connects to $LLM_SERVER and the only interface will be voice/text in which you ask it to do things. It then does those things non-deterministically, and slower than they would be done right now. But at least you won't have control over how it works!

Whether or not the intent is as nefarious as you suggest, that type of UI is going to be a boon for a lot of people. Most people on the planet are incredibly computer-illiterate.

If this could ever happen, there will be no point in GUI apps anymore, your AI assistant or what have you will just interact with everything on your behalf and/or present you with some kind of master interface for everything.

I don't see a bunch of small agents in the future, instead just one per device or user. Maybe there will be a fleeting moment for GUI/local apps to tie into some local, OS LLM library (or some kind of WebLLM spec) to leverage this local agent in your app.


>If this could ever happen, there will be no point in GUI apps anymore, your AI assistant or what have you will just interact with everything on your behalf and/or present you with some kind of master interface for everything.

Sort of how the hammer is the most useful tool ever, and all we have to do is make everything that needs doing look like a nail.


Agents will still have to communicate with each other; the communication protocols, and how data is stored, presented, and queried, will be important for us to decide.

Will we stop using web browsers as we understand them today in the next few decades in favor of only interacting with agents? Maybe.


I've heard this referenced multiple times and I have yet to hear the value be clearly articulated. Are you saying that every user would eventually be using a different app? Wouldn't it eventually get to the point that negates the need for the app developer anyways since you would eventually be unable to offer any kind of support, or are we just talking design changing while the actual functionality stays the same? How would something like this actually behave in reality?

I don't know!

These are valid points; taken to the extreme, we will have apps that cannot be supported.

In the short term, we already have SQL/reports being automated. Lovable etc. are experimenting with generating user interfaces from prompts; soon we will have complete working apps from a prompt. Why not have one core that you can expand via a prompt?

I am currently studying and depending heavily on Anki, and it's been amazing to use Claude Code to add new functionality on the fly. It's a holy mess of inconsistent/broken UX, but it so clearly gives me value over the core version. Sometimes it breaks, but CC can usually fix it within a prompt or two.


> I've heard this referenced multiple times and I have yet to hear the value be clearly articulated.

Me too, and I see this as _incredibly_ wasteful.


LISP returns!

>but medium and long term we need to figure out how to build systems in a way that it can keep up with this increased influx of code.

Why? Why do we need to "write code so much faster and quicker," to the point that we saturate systems downstream? I understand that we can, but just because we can doesn't mean we should.


> to the point we saturate systems downstream

But that's the point of TFA, no? Now that writing code is no longer the bottleneck, the upstream and downstream processes have become the new bottlenecks, and we need to figure out how to widen them.

As I see it, the end goal for all of this is generating software at the speed of thought, or at least at the speed of speech. I want the digital butler to whom I could just say, "I'm not happy with the way things happened today; please change it so that from here on, it'll be like x," and it'll just respond with "As you wish," and I'll have confidence that it knows me well enough and is capable enough to have actually implemented the best possible interpretation of what I asked for, and that the few miscommunications that do occur would be easy to fix.

We're obviously not close to that yet, but why shouldn't we build towards it?


> Now that writing code is no longer the bottleneck

I think it’s contestable that writing the code was ever the main bottleneck.

> As I see it, the end goal for all of this is generating software at the speed of thought, or at least at the speed of speech.

The question is what distinguishes that from having AGI, and if the answer is “nothing”, then that will change the whole game entirely again.


Oh, absolutely, my vision depends on AGI (and maybe even ASI), and I definitely agree that it'll be a whole new ball game.

If we want to continue to ship at that speed we will have to. I’m not sure if we should, but seemingly we are. And it causes a lot of problems right now downstream.

We were already rushing and churning products and code of inferior quality before AI (let's e.g. consider the sorry state of macOS and Windows in the past decade).

Using AI to ship more and more code faster, instead of to make code more mature, will make this worse.


I want to use AI to ship more and more code faster and better. If AI means our product quality goes down we should figure out better ways to use it.

Shouldn't you want to ship less code that does more? Since when was LoC the relevant benchmark for engineering?

Less code isn't as important as it used to be, because the cost of maintaining (simple) code has gone down as well.

With coding agent projects I find that investing in DRY doesn't really help very much. Needing to apply the same fix in two places is a waste of time for a human; an agent will spot both places with grep and update them almost as fast as if there were just one.

It's another case where my existing programming instincts appear to not hold as well as I would expect them to.


When you talk about maintaining code, do you mean having the LLM do it and you maintain a write-only codebase? Because if you're reading the code yourself and you have a bloated tangled codebase it would make things much harder right?

Is the goal basically a codebase where your interactions are mediated through an LLM?


I'm betting on it meaning product quality going down, and technical debt increasing, which will be dealt with by more AI in a downward spiral. Meanwhile, college CS majors won't ever bother learning the basics (as AI will handle their coursework, and even their hobby work). Then future AI will train on previous AI output, with the degradation that brings...

I was having this conversation at work: if the promise of AI coding comes true and we see it in delivery speed, we would need to significantly increase the throughput of every other aspect of the business.

Totally agree - that's what I was trying to get at with "organizational habits". The way we plan, organize and deliver software projects is going to radically change.

I'm not ready to write about how radically though because I don't know myself!


> If we want to keep shipping at this speed

Do we? Spewing features like explosive diarrhea is not something I want.


The linked article is worth reading alongside this one.

The thing I'd add from running agents in actual production (not demos, but workflows executing unattended for weeks): the hard part isn't code volume or token cost. It's state continuity.

Agents hallucinate their own history. Past ~50-60 turns in a long-running loop, even with large context windows, they start underweighting earlier information and re-solving already-solved problems. File-based memory with explicit retrieval ends up being more reliable than in-context stuffing - less elegant but more predictable across longer runs.

Second hard part: failure isolation. If an agent workflow errors at step 7 of 12, you want to resume from step 6, not restart from zero. Most frameworks treat this as an afterthought. Checkpoint-and-resume with idempotent steps is dramatically more operationally stable.
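A minimal sketch of what checkpoint-and-resume with idempotent steps can look like (Python; state file and structure are illustrative, not a specific framework):

```python
import json
import os

STATE_FILE = "workflow_state.json"  # illustrative checkpoint location


def load_checkpoint() -> int:
    """Return the index of the last completed step, or -1 if starting fresh."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)["last_completed"]
    return -1


def save_checkpoint(step_index: int) -> None:
    """Persist progress after each step, so a crash resumes instead of restarting."""
    with open(STATE_FILE, "w") as f:
        json.dump({"last_completed": step_index}, f)


def run_workflow(steps) -> None:
    """Run steps in order, skipping any already completed in a prior run.
    Each step must be idempotent: re-running a completed step is harmless."""
    start = load_checkpoint() + 1
    for i, step in enumerate(steps[start:], start=start):
        step()
        save_checkpoint(i)
```

With this shape, an error at step 7 of 12 leaves a checkpoint at 6, and the next invocation picks up from step 7 rather than from zero.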

Agree it's not just habits - the infrastructure mental model has to change too. You're not writing programs so much as engineering reliability scaffolding around code that gets regenerated anyway.


Rust has cxx which I would argue is "good enough" for most use cases. At least all C++ use cases I have. Not perfect, but pretty damn reasonable.

Why would 200 contributors have to be okay with this migration? The project has a leader, the leader makes decisions.

Because let's say 150 contributors might not be okay with the decision and leave. It's hard to lead from the front if there's nobody behind to be led.

To be fair, any of them who didn't leave in the last few controversies probably won't leave over this.

"I can excuse foaming over pronouns and master branch, but I draw a line at using rust" Would not be surprised by that opinion.

In case you are referring to the picture: that was taken in Vienna at ClawCon.

0.16 changes things around dramatically.

Docs on this?

Here[1]. This mentions async, but it affects every single use of IO functions.

[1] https://kristoff.it/blog/zig-new-async-io/


Ah. Yeah.

My main Zig project will need over 1,000 edits to get it up there :O I've already had Claude spec out the changes required. I'll just have it or Codex or whatever fork itself into one agent per file and bang on it (for the stuff I can't regex myself) ;)

But the IO thing is frankly a good idea. I/O dependency injection, when I've used it in the past, has made testing quite a bit simpler (such as routing any I/O stream into a string to assert on) and the code much easier to reason about. The extra argument is a bit annoying, but that's the price of purity and it's worth it.
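The pattern isn't Zig-specific; a minimal sketch in Python (names illustrative) shows the same idea of passing the I/O handle in rather than reaching for it globally, so a test can route output into a string:

```python
import io
import sys


def report(results, out=sys.stdout):
    """Write a summary to the injected stream instead of printing directly.
    `out` is the injection point: production code passes nothing (stdout),
    while tests pass an in-memory buffer to assert on."""
    for name, ok in results:
        out.write(f"{name}: {'ok' if ok else 'FAIL'}\n")


# In a test, capture the stream as a string:
buf = io.StringIO()
report([("parse", True), ("encode", False)], out=buf)
assert "encode: FAIL" in buf.getvalue()
```

The extra parameter is the same "price of purity" as Zig's extra argument, and it buys the same thing: the function's I/O is visible in its signature.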


Also anything that reads environment variables.

I'm a casual user and the 0.16 changes scare me. I've made multiple attempts now, even with full LLM support, to upgrade, and the result is a) painful and b) not very good. I have serious doubts that the current IO system of 0.16 will survive another release, given the consequences of their choices.

Here's some advice:

1. If you're a casual user (i.e. you don't follow the development), don't try incomplete APIs that not even their creators fully know how they're supposed to work (because they're still tinkering with them). You also can't expect docs until the design is somewhat finalized (which it isn't yet, fyi).

2. LLMs don't help when trying to make sense of the above (a feature that is not complete, has no docs other than some hints in commit messages, and changes every other commit). Reserve LLMs for when things are stable and well documented; otherwise they will just confuse you further.

If you want to try new features before they land in a tagged release, you must engage with the development process at the very least.


> if you're a casual user (ie you don't follow the development) don't try incomplete APIs that not even the creators of fully know how they are supposed to work

Is the completeness of each API formally documented anywhere? Maybe I missed something but it doesn't seem like it is, in which case the only way to know would be to follow what's happening behind the scenes.


Zig cuts releases. This API is not on a release of Zig yet. It's only available through nightly builds of master. "Casual users" should stick to releases if they don't want to deal with incomplete APIs.

That's not really the issue. The stable API is incompatible with the API that will launch with 0.16. It's not really relevant whether I'm playing with an incomplete API; I want to know how I can migrate to it. I haven't moved to 0.16 yet, but I wanted to see.

The migration pain will be the same once it launches unless they revert back, which does not seem likely at all.

But the point is: potentially every API is unstable.


> if you're a casual user (ie you don't follow the development) don't try incomplete APIs that not even the creators of fully know how they are supposed to work

From what I can tell, pretty much everything can be broken at any point in time. So really the only actual advice here is not to use the language at all, which is not reasonable.

> llms don't help when trying to make sense of the above

That has not been my experience. LLMs were what enabled me to upgrade to 0.16 experimentally at all.

> If you want to try new features before they land in a tagged release, you must engage with the development process at the very least.

No, that is unnecessary gatekeeping. 0.16 will become stable at one point and I don't want to wait until then to figure out what will happen. That's not how I used Rust when it was early (I always also tried nightlies) and that line of thinking just generally does not make any sense to me.

The reality is that Zig has very little desire to stabilize at the moment.


The flipside of that is that the incomplete API should be in a separate branch until it is ready to be included in a release, so that people can opt in instead of having to keep in mind what parts of the API they aren't supposed to be using. It doesn't seem like you expect the changes to be finalised in time for 0.16.

Europeans are increasingly associating with Europe over an individual country. US citizens have been doing this for a long time. They are citizens of the US before the state they reside in.

Where is this data from?

EUROSTAT. They run regular surveys. From when they started to today, affiliation with Europe has gone up. Some relatively recent data is here: https://europa.eu/eurobarometer/surveys/detail/2971

Given the dramatic amount of intra-EU migration, that is not too surprising.


This data seems to suggest something completely different than that people are associating as citizens of EU instead of citizens of their respective countries.

93% of respondents are correct in that you become an EU citizen automatically by living in an EU country, and 87% "feel" that they're EU citizens.


I am not saying that European citizens say they are European over their nationality, but that the trend towards a European identity is going up over time.

Exactly. Message-ID is not required.

An unrelated frustration of mine is that Message-ID really should not be overridden but SES for instance throws away your Message-ID and replaces it with another one :(


It is de-facto required and has been for many years.

SHOULD in most RFCs means "do it unless you have a very good technical reason not to" — most of the time it's a weak MUST. In this case, the only reason it isn't a MUST is backward compatibility with older mail systems not used for sending automated mail.

And it is documented if you read any larger mail provider's docs on "what to do so my automated mails don't get misclassified as spam". Spam rejection is a whole additional, non-standardized layer on top of the RFCs that anyone working with mail should be aware of. In a decades-old, decentralized communication system without evergreen standards, having "industry-standard" or de-facto requirements that aren't formally standardized is pretty normal, btw.


What's de-facto required and what isn't is completely irrelevant here. The claim was that the "Message-ID header [is a] requirement that has been part of the Internet Message Format specification ... since 2008" (emphasis mine), which is false; it's a recommendation that has been part of the Internet Message Format specification since 2008.

I'm not working at this company but I found that these types of problems can often be simplified in the architecture.

> Most meetings start on the hour, some on the half, but most on the full. It sounds obvious to say it aloud, but the implication of this has rippled through our entire media processing infrastructure.

When you can control when it happens, you can often jitter things. For instance, the naive approach of snapping rate-limit resets to quantized times (e.g. the minute, the hour) leads to every client coming back at the same time. The solution there is to apply a stable jitter so different clients get different reset times.
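A minimal sketch of stable jitter (Python; window length and spread are illustrative): hash the client ID so each client gets a different but consistent offset within the reset window.

```python
import hashlib

WINDOW_SECONDS = 3600  # hourly reset window (illustrative)


def reset_offset(client_id: str, spread: int = 300) -> int:
    """Derive a stable per-client offset from a hash of the client ID,
    so each client's reset lands at a different but consistent second
    within the first `spread` seconds of the window."""
    digest = hashlib.sha256(client_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % spread


def next_reset(now: int, client_id: str) -> int:
    """Next reset timestamp: start of the next window plus the stable jitter."""
    window_start = (now // WINDOW_SECONDS + 1) * WINDOW_SECONDS
    return window_start + reset_offset(client_id)
```

Because the jitter comes from a hash rather than a random draw, a client always sees the same reset time, but the population of clients is spread out instead of stampeding on the hour.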

That pattern does not go all that well with meetings, as they need to happen when they happen, which is mostly on the hour or half hour. However, the lead-up time to those meetings is often quite long, so you can do the work that needs to happen on the hour quite a bit ahead of time, and then apply the changes in one large batch on the minute.

You have similar problems quite often with things like weekly update emails. At scale it can take a lot of time to prepare all the updates, often more than 12 hours. But you don't want the mails to arrive at wildly different times of day, so you really need to get the reports prepared ahead of time and then send them out together when ready.


They mention that they implemented jitter later in the post.


But my reading of their jitter is that it's a very narrow one, applied only to the actual connection to the database. They are still doing most of the work on the minute.


They changed it. It was singular.

