
> We don’t understand LLMs either. We built them, but we can’t explain why they work.

Just because you don't doesn't mean no one does. It's a pile of math, and somewhere along the way something happened to get us where we are, but look at Golden Gate Claude, at the abliteration of openly shared models, or at OpenAI's paper on hallucinations: there's a lot of detail and knowledge about how these things work that isn't instantly accessible or readily apparent to everyone on the Internet. As laymen all we can do is black-box testing, but there's some really interesting work going on editing models and getting them to talk like a pirate.

The human brain is very much an unknowable squishy box because putting probes into it would harm the person whose brain it is, and we don't like doing that to people because people are irreplaceable. We don't have that problem with LLMs. It's entirely possible to look at the memory register at location x at time y and correspond that to a particular tensor, which corresponds to a particular token, which in turn corresponds to a particular word we humans can understand. If you want to understand LLMs, start looking! It's an active area of research and it's very interesting!
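A rough sketch of what "start looking" can mean in practice, using the Hugging Face transformers library with GPT-2 as a stand-in (the model choice, prompt, and layer index here are arbitrary assumptions, not anything specific the comment above had in mind): read out a hidden state at a given layer and token position, then project it through the unembedding matrix to see which words that internal vector leans toward, the so-called logit lens:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"                      # stand-in; any causal LM that exposes hidden states works
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    inputs = tok("The Golden Gate Bridge is", return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)

    layer, pos = 6, -1                               # arbitrary layer, last token position
    hidden = out.hidden_states[layer][0, pos]        # the raw activation vector at "location x, time y"
    hidden = model.transformer.ln_f(hidden)          # GPT-2-specific final layer norm before unembedding
    logits = model.lm_head(hidden)                   # project back into vocabulary space
    top_ids = torch.topk(logits, 5).indices.tolist()
    print(tok.convert_ids_to_tokens(top_ids))        # the tokens this internal state leans toward

This only reads one vector through one lens, so it's nowhere near a full account of the model, but it's the kind of direct access a living brain simply doesn't give us.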



You are missing the ground truth. Humanity does not understand how LLMs work. Every major lab and every serious researcher acknowledges this. What we have built is a machine that functions, but whose inner logic no one can explain.

References like Golden Gate Claude or the latest interpretability projects don’t change that. Those experiments are narrow glimpses into specific activation patterns or training interventions. They give us localized insight, not comprehension of the system as a whole. Knowing how to steer tone or reduce hallucinations does not mean we understand the underlying cognition any more than teaching a parrot new words means we understand language acquisition. These are incremental control levers, not windows into the actual mind of the model.

When we build something like an airplane, no single person understands the entire system, but in aggregate we do. Aerodynamicists, engineers, and computer scientists each master their part, and together their knowledge forms a complete whole. With LLMs, even that collective understanding does not exist. We cannot even fully describe the parts, because the "parts" are billions of distributed parameters interacting in nonlinear ways that no human can intuit or map. There is no subsystem diagram, no modular comprehension. The model's behavior is not the sum of components we understand; it is the emergent product of relationships we cannot trace.

You said we “know” what is going on. That assumption is patently false. We can see the equations, we can run the training, we can measure activations, but those are shadows, not understanding. The model’s behavior emerges from interactions at a scale that exceeds human analysis.

This is the paradigm shift you have not grasped. For the first time, we are building minds that operate beyond the boundary of human comprehension. It is not a black box to laymen. It is a black box to mankind.

And I say this as someone who directly works on and builds LLMs. The experts who live inside this field understand this uncertainty. The laymen do not. That gap in awareness is exactly why conversations like this go in circles.



