khafra's comments | Hacker News

They're the last living humans, and the last human-derived mind?

I like Cordwainer Smith and Peter Watts; so I really liked this blend of their styles and subjects.


I adore Peter Watts, so I'll be checking out Smith then!

Watts is like brain candy; he keeps my mind buzzing with ideas for weeks. Charles Stross can have the same effect, a sort of future shock.


Cordwainer Smith's style and subject matter are quite different from Watts's; I felt this story was like a combination of the two. So, if you would have liked this story even if it were less eschatologically cynical and had more of a golden-age setting, you'll probably like Smith!

> undo market influence

Pointless nitpick, but you want "undue market influence." "Undo market influence" is what the FTC orders when it decides there are monopolistic practices going on.


Not pointless. I had no idea what the original wording meant.

> Non-trivial coding tasks

A coding agent just beat every human in the AtCoder Heuristic optimization contest. It also beat the solution that the production team for the contest put together. https://sakana.ai/ahc058/

It's not enterprise-grade software, but it's not a CRUD app with thousands of examples in github, either.


> AtCoder Heuristic optimization contest

This is an optimization space that was already being automated before LLMs. Big surprise: machines are still better at it.

This feels a bit like comparing programming teams to automated fuzzing.

In fact, developing algorithms has not infrequently involved some kind of automated testing, in which the algorithm is permuted automatically.
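For instance, a minimal sketch of that kind of permute-and-test loop (the scorer and parameters here are made up, not from any real contest harness):

    import random

    def score(params):
        # stand-in for an automated benchmark of the algorithm variant
        x, y = params
        return -(x - 3.0) ** 2 - (y + 1.0) ** 2

    best = (0.0, 0.0)
    for _ in range(10_000):
        # permute the current best; keep it only if it tests better
        candidate = tuple(p + random.gauss(0, 0.5) for p in best)
        if score(candidate) > score(best):
            best = candidate

    print(best)  # wanders toward the optimum near (3, -1)

The point is that once the objective is automatically testable, the loop needs no human in it.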

It's also a bit like how OCR and a couple of other fields (protein folding) are better done in an automated manner.

The fact that this is now done by an LLM, another machine, isn't exactly surprising. Nobody claims that computers aren't good at these kinds of tasks.


> It's not enterprise-grade software, but it's not a CRUD app with thousands of examples in github, either.

Optimization is a very simple problem though.

Maintaining a random CRUD app from some startup is harder work.


The argument was about “non-trivial”. Are you calling this work trivial or not?


> Optimization is a very simple problem though.

C'mon, there's a post every other week saying optimization never happens anymore because it's too hard. If AI can take all the crap code humans are writing and make it better, that sounds like a huge win.


Simple is the opposite of complex; the opposite of hard is easy. They are orthogonal. Chess is simple and hard. Go is simpler and harder than chess.

Program optimization problems are less simple than both, but still simpler than free-form CRUD apps with fuzzy, open-ended acceptance criteria. It stands to reason that an autonomous agent would do well at mathematically challenging problems with a bounded search space and automatically testable, quantifiable output.

(Not GP but I assume that's what they were getting at)


> If AI can take all the crap code humans are writing and make it better, that sounds like a huge win.

This sort of misunderstanding of achievements is what keeps driving the AI mania. The AI generated an algorithm for optimizing a well-defined, bounded mathematical problem that marginally beat the human-written algorithms.

This AI can't do what you're hyping it up to do, because software optimization is a different kind of optimization problem: it's complex, underspecified, and has no general algorithmic solutions.

LLMs may play a significant role in optimizing software someday, but it's not going to have much in common with optimization in the mathematical sense, so this achievement doesn't get us any closer to that goal.


Compilers beat most coders before LLMs were even popular.


Had to scroll far to find the problem description:

> AHC058, held on December 14, 2025, was conducted over a 4-hour competition window. The problem involved a setting where participants could produce machines with hierarchical relationships, such as multiple types of “apple-producing machines” and “machines that build those machines.” The objective was to construct an efficient production planning algorithm by determining which types and hierarchies of machines to upgrade and in what specific order.

... so not a CRUD app but it beat humans at Cookie Clicker? :-)
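For flavor, a toy greedy baseline for that kind of hierarchical production problem might look like the sketch below (all numbers invented; this is not the actual AHC058 spec, and certainly not the winning approach):

    def greedy_plan(cost, rate, turns):
        # machines[i]: count at tier i; tier 0 makes apples,
        # tier i > 0 makes tier i-1 machines each turn
        apples = 0.0
        machines = [1.0] + [0.0] * (len(cost) - 1)
        for _ in range(turns):
            apples += machines[0] * rate[0]
            for i in range(1, len(machines)):
                machines[i - 1] += machines[i] * rate[i]
            # buy the affordable tier with the shortest naive payback
            affordable = [i for i in range(len(cost)) if cost[i] <= apples]
            if affordable:
                i = min(affordable, key=lambda i: cost[i] / rate[i])
                apples -= cost[i]
                machines[i] += 1
        return apples

    print(greedy_plan(cost=[10, 50, 250], rate=[1, 5, 25], turns=100))

The contest's difficulty is in beating baselines like this within the time limit, not in writing them.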


LLMs are an especially tough case, because the field of AI had to spend sixty years telling people that real AI was nothing like what you saw in the comics and movies; and now we have real AI that presents pretty much exactly like what you used to see in the comics and movies.


But it cannot think or mean anything; it's just a clever parrot, so it's a bit weird. I guess uncanny is the word. I use it as Google now, just to search for stuff that's hard to express with keywords.


99% of humans are mimics; they contribute essentially zero original thought across 75 years. Mimicry is more often an ideal optimization of nature (of which an LLM is part) than a flaw. Most of what you'll ever want an LLM to do is to be a highly effective parrot, not an original thinker. Origination as a process is extraordinarily expensive and wasteful (see: entrepreneurial failure rates).

How often do you need original thought from an LLM versus parrot thought? The extreme majority of all use cases globally will only ever need a parrot.


> clever parrot

Is it irony that you duckspeak this term? Are you being a stochastically clever monkey to avoid using the standard cliché?

The thing I find most educational about AI is that it unfortunately mimics the standard of thinking of many humans...


Try asking it a question you know has never been asked before. Is it parroting?


To an LLM, you can't anonymize comments from well-known users: https://gwern.net/doc/statistics/stylometry/truesight/index


That's an overly strong claim; an LLM could also be used to normalise style.
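A minimal sketch of that, assuming the OpenAI Python client (the model choice and prompt are illustrative, not anything gwern tested against):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def normalize(comment: str) -> str:
        # paraphrase into flat, neutral prose to strip stylistic tells
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system",
                 "content": "Rewrite the comment in plain, neutral prose. "
                            "Preserve every factual claim; remove all "
                            "stylistic tics, idioms, and formatting habits."},
                {"role": "user", "content": comment},
            ],
        )
        return resp.choices[0].message.content

Whether that actually defeats stylometry is an open question; word choice and argument structure may still leak through the paraphrase.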


How would you possibly grade comments if you change them?


Extract the concrete predictions, evaluate them as true/false/indeterminate, and grade the user on the number of true vs false?
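Roughly like this, assuming the hard part, extracting a verdict per claim, is already done by an LLM (the function and data here are hypothetical):

    def grade(verdicts):
        # verdicts: list of "true" / "false" / "indeterminate"
        resolved = [v for v in verdicts if v != "indeterminate"]
        if not resolved:
            return None  # nothing gradable in this user's comments
        return resolved.count("true") / len(resolved)

    print(grade(["true", "indeterminate", "true", "false"]))  # ~0.67

Since the paraphrase preserves factual claims, the truth values, and hence the grade, survive normalization.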


This doesn't even seem to look at "predictions" if you dig into what it actually did. Looking at my own example (#210 on https://karpathy.ai/hncapsule/hall-of-fame.html with 4 comments), very little of what I said could be construed as "predictions" at all.

I got an A for commenting on DF saying that I had not personally seen save corruption and listing weird bugs. It's true that weird bugs have long been a defining feature of DF, but I didn't predict it would remain that way or say that save corruption would never be a big thing, just that I hadn't personally seen it.

Another A for a comment on Google wallet just pointing out that users are already bad at knowing what links to trust. Sure, that's still true (and probably will remain true until something fundamental changes), but it was at best half a prediction as it wasn't forward looking.

Then something on hospital airships from the 1930s. I pointed out that one could escape pollution, I never said I thought it would be a big thing. Airships haven't really ever been much of a thing, except in fiction. Maybe that could change someday, but I kinda doubt it.

Then lastly there was the design patent famously referred to as the "rounded corner" patent. It dings me for simplifying it to that label, even though what I actually said was that yes, there's more to it, but minor details like that can be sufficient for infringement. But the LLM says I'm right about the ties to the Samsung case and am still oversimplifying. Either way, none of this was really a prediction to begin with.


You don’t need comments, just facts in them to see if they’re accurate.


The natural solution is futarchy: vote on values, bet on beliefs. Everybody knows that, all else being equal, they want higher GDP per capita, a better Gini coefficient, and a higher happiness index. Only the experts know whether tariffs will help produce this.

So, instead of having everyone vote on tariffs (or vote for a whimsical strongman who will implement tariffs), have everyone vote for the package of metrics they want to hit. Then let experts propose policy packages to achieve those metrics, and let everyone bet on which policies will achieve the goals.

Bullshit gets heavily taxed, and the beliefs of people who actually know the likely outcomes will be what guide the nation.
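A toy sketch of the mechanism (all weights and forecasts invented): voters fix the welfare function, conditional prediction markets supply the expected metrics under each policy, and the argmax gets adopted.

    # voted welfare weights; Gini is weighted negative since lower is better
    weights = {"gdp_per_cap": 0.5, "gini": -0.3, "happiness": 0.2}

    # market-implied expected metrics, conditional on each policy package
    forecasts = {
        "tariffs":    {"gdp_per_cap": 1.0, "gini": 0.45, "happiness": 0.50},
        "free_trade": {"gdp_per_cap": 1.2, "gini": 0.40, "happiness": 0.55},
    }

    def welfare(metrics):
        return sum(weights[k] * metrics[k] for k in weights)

    winner = max(forecasts, key=lambda p: welfare(forecasts[p]))
    print(winner)  # the policy the markets expect to score best

Bettors who push the conditional forecasts away from reality lose money to those who know better; that is the tax on bullshit.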


If you start out as a non-profit, and pull a bunch of shady shenanigans in order to convert to a for-profit, claiming to be ethical after that is a bit of a hard sell.


This piece is wildly optimistic about the likely outcomes from AI on par with the smartest humans, let alone smarter than that. The author seems to think that widespread disbelief in the legitimacy of the system could make a difference in such a world.


All of the leading labs are on track to kill everyone, even Anthropic. Unlike the other labs, Anthropic takes reasonable precautions, and strives for reasonable transparency when it doesn't conflict with their precautions; which is wholly inadequate for the danger and will get everyone killed. But if reality graded on a curve, Anthropic would be a solid B+ to A-.


Shared-keyboard OMF 2097 also had an overwhelming advantage for the first mover, since most keyboards had 2-3 key rollover: if you hit W+D to jump forward, your opponent had to be fast to do anything before you hit your attack key.
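The rollover limit comes from the switch matrix: without per-key diodes, three pressed keys sharing rows and columns make a phantom fourth key electrically indistinguishable, so controllers block extra keys instead. A toy model (key positions invented):

    def detected(pressed):
        # a key (r, c) reads as pressed iff row r is electrically
        # connected to column c through the pressed switches;
        # union-find over a bipartite graph of rows and columns
        parent = {}
        def find(x):
            while parent.setdefault(x, x) != x:
                x = parent[x]
            return x
        for r, c in pressed:
            parent[find(("row", r))] = find(("col", c))
        rows = {r for r, _ in pressed}
        cols = {c for _, c in pressed}
        return {(r, c) for r in rows for c in cols
                if find(("row", r)) == find(("col", c))}

    print(detected({(0, 0), (0, 1), (1, 0)}))
    # includes the phantom (1, 1); the matrix can't tell the difference

Modern NKRO boards put a diode behind every switch so current can't flow backward through pressed keys.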


I wonder if OpenOMF has the same limits.


It's more of a keyboard thing than a software thing.

