
Models you can run on your own (expensive) computer are just a year behind the SOTA. Linux exists. Why are you so pessimistic?

Typical HN comment. They’re so deep in the weeds of 1% edge-case concerns that they can’t see the golden age around them.

Most people living through golden ages might not know it. Many workers in the Industrial Revolution saw a decline in relative wages. Many in the Roman Empire were enslaved or impoverished. That doesn’t mean history doesn’t regard these as golden ages, where a golden age is defined loosely as a broad period of enhanced prosperity and productivity for a group of people.

For all its downsides, pointed out amply above, the golden age of computing started 100 years ago and hasn’t ceased yet.


> Many workers in the Industrial Revolution saw a decline in relative wages.

Yeah! Why weren't all those children with mangled limbs more optimistic about the future? Why weren't they singing the praises of the golden age around them? Do you think it would have resulted in a golden age for anyone except a very small few if the people hadn't spoken out against the abuses of the greedy industrialists and robber barons and united against them?

If you can't see what's wrong with what's happening in front of you today and you can't see ahead to what's coming at you in the future you're going to be cut very badly by those "edge cases". Instead of blinding ourselves to them, I'd recommend getting into those weeds now so that we can start pulling them up by their roots.


The question should be "golden age FOR WHOM?" because the traditional meaning of that phrase implies a society-wide raising of the quality of life. It remains to be seen whether the advent of AI signifies an across-the-board improvement or a furthering of the polarization between the haves and have nots.

A gold rush is not the same thing as a golden age.

"the golden age of computing started 100 years ago"

Only 14% of Americans described themselves as "very happy" in recent studies, a sharp decline from 31% in 2018.

Woohoo, we did it: our neighbors are being sent to prison camps run by people who work with the "golden age" bringers. Go team. Nice "golden age" you got there, peasant.


So what you're saying is that lots of people being unemployed and dying from a lack of resources is merely a "downside", and we should all just support your mediocre idea of what a "golden age" is?

You're right, this right here is the typical HN comment.

I’ve done some phone programming over the Xmas holidays with clawdbot. This does work, BUT you absolutely need to demand clearly measurable outcomes from the agent, like a closed feedback loop, a comparison with a reference implementation, or a perfect score in a simulated environment. Without those, the implementation will be incomplete and likely utter crap.

Even then, the architecture will be horrible unless you chat _a lot_ about it upfront. At some point, it’s easier to just look in the terminal.
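
For illustration, a minimal sketch of what such a closed feedback loop can look like (assuming Python; reference_implementation and agent_implementation are hypothetical stand-ins for a trusted baseline and the agent’s generated code):

  import random

  # Hypothetical stand-ins: a trusted baseline vs. the agent's generated code.
  def reference_implementation(x: int) -> int:
      return abs(x) % 7

  def agent_implementation(x: int) -> int:
      return abs(x) % 7  # replace with the code the agent produced

  def feedback_loop(trials: int = 10_000) -> list:
      """Hammer both with random inputs; return all divergences."""
      failures = []
      for _ in range(trials):
          x = random.randint(-10**9, 10**9)
          expected = reference_implementation(x)
          actual = agent_implementation(x)
          if actual != expected:
              failures.append((x, expected, actual))
      return failures

  mismatches = feedback_loop()
  print("PASS" if not mismatches else f"FAIL on {len(mismatches)} inputs")

An empty failure list is the kind of clearly measurable outcome the agent has to hit before the task counts as done.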


Apparently, mitochondria and their (non-repairable) DNA play a big role as well.

What you’re describing is that we’d turn deterministic engineering into the same march of 9s that FSD and robotics are going through now, but for every single workflow. If you can’t check the code for correctness and debug it, then your test system must be absolutely perfect and cover every possible outcome. Since that’s not possible for nontrivial software, you’re starting a march of 9s towards 100% correctness of each solution.

That accounting software will need 100M unit tests before you can be certain it covers all your legal requirements. (Hyperbole, but you get the idea.) Who’s going to verify all those tests? Do you need a reference implementation to compare against?

Making LLM work opaque to inspection is kind of like pasting the conclusion of a mathematical proof without the proof itself (which is almost worthless AFAIK).
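
As a toy version of the reference-implementation idea (a sketch only; the VAT functions below are made up, and it assumes the hypothesis library is installed), property-based testing turns the comparison into something a machine can grind on, which scales better than hand-verifying millions of unit tests:

  from hypothesis import given, strategies as st

  # Made-up example: a trusted baseline vs. LLM-generated code under test.
  def reference_vat(net_cents: int) -> int:
      return round(net_cents * 20 / 100)

  def generated_vat(net_cents: int) -> int:
      return round(net_cents * 20 / 100)  # the opaque, generated version

  @given(st.integers(min_value=0, max_value=10**12))
  def test_generated_matches_reference(net_cents):
      # Any divergence becomes a concrete, inspectable counterexample.
      assert generated_vat(net_cents) == reference_vat(net_cents)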


  > Who’s going to verify all those tests?
why, the user of course

Should have used codex. (jk ofc)

> no real increase in final delivery times

That’s not true though. The ability to de-risk concepts within a day instead of weeks will speed up the timeline tremendously.


Tell me you haven’t used codex-xhigh without telling me you haven’t used it. It’s bad at overall architecture and the big picture, but not at useful abstractions.

Opus is really good at bash, and it’s damn fast. Codex is catching up on that front, but it’s still nowhere near Opus. However, Codex is better at coding, full stop.

Please no, I don’t need my quick prototypes hardened against every perceivable threat.

In most cases security is not a matter of adding anything in particular, but a matter of just not making specific types of mistakes.

Maybe I'm being dumb, but that reads as contradictory? I would say that security is explicitly a matter of adding particular things.

Not the OP, but it seems like you might be talking about different things.

Security could be about not adding certain things or not making certain mistakes, like not building SQL queries with data interpolated directly into the query string and instead using bindings or an ORM.

But if you have an insecure raw query that you feed into an ORM you added on top, that's not going to make the query any more secure.
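
To make that concrete, a minimal sketch with Python's stdlib sqlite3 (the users table and the hostile input are invented for illustration):

  import sqlite3

  conn = sqlite3.connect(":memory:")
  conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
  conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

  name = "alice' OR '1'='1"  # hostile input

  # Insecure: data interpolated into the SQL string (injection):
  #   conn.execute(f"SELECT role FROM users WHERE name = '{name}'")

  # Secure: a binding keeps the data out of the SQL text entirely.
  rows = conn.execute("SELECT role FROM users WHERE name = ?", (name,))
  print(rows.fetchall())  # [] - the hostile string matches nothing

Nothing was added on top; the fix is simply not building the query the mistaken way.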

On the other hand, when you're securing endpoints in an API, you do add things like authorization, input validation, and parsing.

So I think a lot depends on what you mean when you're talking about security.

Security is security: making sure bad things don't happen. Sometimes that means a different approach in the code, sometimes additions to the code, and sometimes removing things from the code.


Is there ever a reason to store passwords in plaintext instead of as a hash? Even in a prototype.
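
For what it's worth, salted hashing is only a few lines even in a prototype. A minimal sketch with Python's stdlib (hash_password and verify_password are made-up helper names; n=2**14, r=8, p=1 are commonly recommended interactive scrypt parameters):

  import hashlib, hmac, secrets

  def hash_password(password: str):
      """Return (salt, digest); store these, never the plaintext."""
      salt = secrets.token_bytes(16)
      digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
      return salt, digest

  def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
      candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
      return hmac.compare_digest(candidate, digest)

  salt, digest = hash_password("hunter2")
  assert verify_password("hunter2", salt, digest)
  assert not verify_password("wrong", salt, digest)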

The better question is which LLM is going to make such a basic mistake?

I guess humans were involved in all that, so how is that anything but tool use?
