The implication that any software is "mysterious" is problematic - there is no "...

margalabargala · 2025-03-17T19:17:05 1742239025

You're misunderstanding. A level of abstraction is necessary for operation of modern systems. There is no human alive who, given an intermediate step in the middle of some running learning algorithm, is able to understand and mentally model the full system at full man-made resolution, that is, down to the transistor level, on a modern CPU. Someone wishing to understand a piece of software in 2025 is forced to, at some point, accept that something somewhere "does what it says on the tin" and model it thusly rather than having a full understanding.

EncomLab · 2025-03-17T20:13:57 1742242437

It's not a misunderstanding at all - but your response is certainly an attempt to obfuscate the point being made. The moment you represent anything in code, you are abstracting a real thing into its digital representation. That digital representation is fully formed at every cycle of the digital system processing it, and the state of the system - all the way down to the transistor level - may be precisely determined. To say otherwise is to make the same error as those who claim that consciousness or understanding are indefinable "extra-ordinary" things that we have to just accept exist without any justification or evidence.

margalabargala · 2025-03-17T22:20:40 1742250040

Okay, then, you're just using your own personal definition of "black box" instead of the one everyone else uses.

Something that's a black box is unknown to the speaker. It's not understood to be unknowable to anyone.

EncomLab · 2025-03-17T23:14:46 1742253286

So your claim is that there are instructions, data, or both that cannot be determined in what is, by definition, a fully deterministic machine?

margalabargala · 2025-03-18T00:06:41 1742256401

By an individual person, yes. I claim that there exists no single human capable of fully understanding the totality of the software and hardware down to the individual transistor level.

arkadiytehgraet · 2025-03-21T21:52:37 1742593957

That's a very wrong statement. Pretty sure I could explain all the maths, all the physics, all the electronics, all the operating systems and all the user space of a single high level language operation, when I was a fresh graduate. Now, I have forgotten most of the physics and electronics, since the university was quite some time ago, but feel free to ask any decent student of an IT bachelor, they should be able to pretty much build the PC from scratch. Sure, modern processors and whatnot add a bunch of optimizations, but you seem to really overstate the complexity of the computer.

margalabargala · 2025-03-22T02:13:54 1742609634

We're talking about two separate things.

I'm talking about understanding, fully, the state of the CPU. Not just the conceptual operation of the CPU. Like, given a specific, modern AMD or Intel CPU, understand fully all states of all transistors.

EncomLab · 2025-03-18T13:05:24 1742303124

I agree and never claimed that "a single person" could - but just because something is too complex for a single person to fully understand does not make it "mysterious" or a "black box". So what is the claim you are making? Anything beyond the complexity of a single person to understand = magic?

margalabargala · 2025-03-18T14:21:37 1742307697

We're just using different definitions for "black box".

My definition is that it's something unknown, yours is that it's something unknowable.

xmprt · 2025-03-17T19:31:26 1742239886

The mystery was never in the "how do computers calculate the probabilities of next tokens" but rather in the "why is it able to work so well" and "what does this individual neuron contribute to the whole model".

UltraSane · 2025-03-18T00:53:03 1742259183

The mystery is in how the data is encoded in the parameters and why LLM performance scales so well with parameter count. The key seems to be almost orthogonal vectors, which allow neural networks to store so much data. They allow 2^(cn) vectors to be learned in an n-dimensional space, with c being a constant. Since almost orthogonal vectors have very small dot products, they minimally interfere with each other, allowing many concepts to coexist with limited cross talk, which enables superposition.
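
To make the near-orthogonality point concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the vectors are random Gaussian directions rather than anything learned by a model, and the function names (`random_unit_vector`, `max_abs_cosine`) are made up for this example. It samples more vectors than a low-dimensional space can hold orthogonally and reports the largest absolute dot product between any pair, which tends to shrink as the dimension grows.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Sample a random direction: Gaussian coordinates scaled to unit length."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def max_abs_cosine(dim, count, seed=0):
    """Largest |dot product| over all pairs of `count` random unit vectors.

    For unit vectors the dot product equals the cosine of the angle, so a
    value near 0 means every pair is almost orthogonal.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    vectors = [random_unit_vector(dim, rng) for _ in range(count)]
    worst = 0.0
    for i in range(count):
        for j in range(i + 1, count):
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            worst = max(worst, abs(dot))
    return worst


if __name__ == "__main__":
    # 50 vectors cannot be exactly orthogonal in 16 dimensions, but as the
    # dimension grows the worst-case overlap between any two of them drops.
    for dim in (16, 128, 1024):
        print(dim, round(max_abs_cosine(dim, count=50), 3))
```

Random vectors are a stand-in here; whether trained networks arrange their features this way is exactly the open question being discussed.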

tlb · 2025-03-18T08:43:59 1742287439

I don't know any serious programmer who thinks that, just because each operation is simple, the operation of the whole thing can't be mysterious.

01HNNWZ0MV43FF · 2025-03-17T19:26:01 1742239561

But the weights trained from machine learning are a black box, in the sense that no human designed e.g. the image processing kernels that those weights represent.

That is one reason people are skeptical of them: not only is training a large model at home expensive and the data too big to trivially store, but the weights are not trivial to debug either.
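
For contrast, here is a small sketch of a kernel that a human did design: the classic Sobel horizontal-gradient filter, applied with a hypothetical helper `correlate2d_valid` written for this example. Every number in `SOBEL_X` has a readable purpose (left column negative, right column positive, so the response is large where brightness changes from left to right). A trained convolutional layer holds arrays of the same shape, but their values come from optimization, not from a person, which is the sense of "black box" meant above.

```python
# Hand-designed 3x3 Sobel kernel: responds to left-to-right brightness changes.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding) and sum elementwise products.

    This is cross-correlation, the operation convolutional layers compute.
    Both arguments are lists of equal-length rows.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


if __name__ == "__main__":
    # A 4x4 image with a vertical edge between the dark and bright halves.
    edge_image = [[0, 0, 1, 1] for _ in range(4)]
    # A uniformly bright image with no edges at all.
    flat_image = [[1, 1, 1, 1] for _ in range(4)]
    print(correlate2d_valid(edge_image, SOBEL_X))
    print(correlate2d_valid(flat_image, SOBEL_X))
```

The edge image produces a strong response everywhere the window straddles the edge, while the flat image produces zeros, and both outcomes can be predicted by reading the kernel. That kind of inspection is what gets hard with learned weights.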