
LLMs don't think. At all. They do next token prediction.

If they are conditioned on a large data set that includes lots of examples of the result of people thinking, what they produce will look sort of like the results of thinking, but then if they were conditioned on a large data set of people repeating the same seven knock knock jokes over and over and over in some complex pattern (e.g. every third time, in French), what they produce will look like that, and nothing like thinking.

Failing to recognize this is going to get someone killed, if it hasn't already.



A weights tensor is very similar to a truth table or a LUT in an FPGA; it's just a generalization of it with real numbers instead of booleans.
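A toy illustration of that generalization (my own sketch, with made-up numbers, not anything from a real FPGA toolchain or model):

    # Minimal sketch: a 2-input boolean LUT vs. its real-valued generalization.
    # The table implements AND; the weighted version reproduces it and can
    # take arbitrary real weights instead of fixed booleans.
    lut = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

    def lut_gate(a, b):
        return lut[(a, b)]

    def weighted_gate(a, b, w=(1.0, 1.0), bias=-1.5):
        # illustrative weights/bias chosen to match the AND table
        return 1 if a * w[0] + b * w[1] + bias > 0 else 0

    for inputs in lut:
        assert lut_gate(*inputs) == weighted_gate(*inputs)

Swap the booleans for reals and stack billions of these and you have a weights tensor.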

And then again, would you say that you cannot build a (presumably extremely complex) machine that thinks?

Do you think our brains are not complex biological machines?

Where I agree is that LLMs are absolutely not the endgame. They are super-human literary prodigies. That's it. Literary specialists, like poets, writers, screenwriters, transcribers, and so on. We should not ask them anything else.


Are you thinking over every character you type? You are conditioned too, by all the info flowing into your head from birth. Does that guarantee everything your brain says and does is perfect?

People believed in non-existent WMDs and tens of thousands got killed. After that, what happened? Chimps with 3-inch brains feel super confident to run orgs and make decisions that affect entire populations and are never held accountable. Ask Snowden what happened after he recognized that.


> LLMs don't think. At all.

How can you so confidently proclaim that? Hinton and Ilya Sutskever certainly seem to think that LLMs do think. I'm not saying that you should accept what they say blindly due to their authority in the field, but their opinions should give your confidence some pause at least.


>> LLMs don't think. At all.

>How can you so confidently proclaim that?

Do you know why they're called 'models' by chance?

They're statistical, weighted models. They use statistical weights to predict the next token.
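To make "use statistical weights to predict the next token" concrete, a toy version (hand-rolled, with a tiny made-up vocabulary and numbers, not any real model's code) looks roughly like this:

    import math, random

    # Toy next-token prediction: a context vector times a weight matrix gives
    # one score (logit) per vocabulary token; softmax turns scores into
    # probabilities; the next token is sampled from them.
    vocab = ["the", "cat", "sat", "."]
    context = [0.2, -1.0, 0.5]            # hidden state for the prompt so far
    weights = [[0.1, 0.4, -0.3],          # one row of weights per vocab token
               [0.7, -0.2, 0.5],
               [-0.1, 0.3, 0.8],
               [0.0, 0.1, -0.4]]

    logits = [sum(w * x for w, x in zip(row, context)) for row in weights]
    exps = [math.exp(l) for l in logits]
    probs = [e / sum(exps) for e in exps]
    next_token = random.choices(vocab, weights=probs)[0]
    print(list(zip(vocab, probs)), "->", next_token)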

They don't think. They don't reason. Math, weights, and turtles all the way down. Calling anything an LLM does "thinking" or "reasoning" is incorrect. Calling any of this "AI" is even worse.


If you have an extremely simple theory that debunks the status quo, it is safer to assume there is something wrong with your theory, than to assume you are on to something that no one else figured out.

You are implicitly assuming that no statistical model acting on next-token prediction can, conditional on context, replicate all of the outputs that a human would give. This is a provably false claim, mathematically speaking, as human output under these assumptions would satisfy the conditions of Kolmogorov existence.
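For reference, the factorization underlying that existence argument is just the chain rule (my paraphrase of the standard result, not a quote from any paper):

    p(x_1, \dots, x_T) = \prod_{t=1}^{T} p(x_t \mid x_1, \dots, x_{t-1})

Any distribution over token sequences, including whatever distribution describes human-produced text, factors this way, so "it only predicts the next token" places no limit by itself on what such a model can represent.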


Sure.

However, the status quo is that "AI" doesn't exist, computers only ever do exactly what they are programmed to do, and "thinking/reasoning" wasn't on the table.

I am not the one that needs to disprove the status quo.


No, the status quo is that we really do not know. You made a claim why it is impossible for LLMs to think on the grounds that they are statistical models, so I disproved your claim.

If it really was that simple to dismiss the possibility of "AI", no one would be worried about it.


I never said it was impossible. Re-read it, and kindly stop putting words in my mouth. :)


But is the connection of neurons in our brains any more than a statistical model implemented with cells rather than silicon?


You're forgetting the power of the divine ineffable human soul, which turns fatty bags of electrolytes from statistical predictors into the holy spirit.


An LLM is very much like a CPU. It takes inputs and performs processing on them based on its working memory and previous inputs and outputs, and then produces a new output and updates its working memory. It then loops back to do the same thing again and produce more outputs.
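In code, the loop being described is roughly the following (an illustrative skeleton with a hypothetical predict() stand-in, not a real model):

    # Sketch of the autoregressive loop: the model keeps appending its own
    # output to its context (working memory) and runs again.
    def predict(context):
        # placeholder "model": emits a counter so the loop terminates
        return f"token{len(context)}" if len(context) < 5 else "<eos>"

    context = ["<prompt>"]           # working memory: prompt plus everything emitted so far
    while True:
        token = predict(context)     # process inputs + working memory -> new output
        if token == "<eos>":
            break
        context.append(token)        # update working memory, loop back
    print(context)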

Sure, they were evolved using criteria based on next token prediction. But you were also evolved, only using criteria for higher reproduction.

So are you really thinking, or just trying to reproduce?


Do you think Hinton and Ilya haven’t heard these arguments?


I hate to be that guy, but this (a) has little to do with the actual problem at hand in the article, and (b) is a dramatic oversimplification of the real challenges with LLMs.

> LLMs don't think. At all. They do next token prediction.

This is very often repeated by non-experts as a way to dismiss the capabilities of LLMs as some kind of mirage. It would be so convenient if it were true. You have to define what 'think' means; once you do, you will find it more difficult to make such a statement. If you consider 'think' to be developing an internal representation of the query, drawing connections to other related concepts, and then checking your own answer, then there is significant empirical evidence that high-performing LLMs do the first two, and one can make a good argument that test-time inference does a half-adequate, albeit inefficient, version of the third. Whether LLMs will achieve human-level efficiency with these three things is another question entirely.

> If they are conditioned on a large data set of people repeating the same seven knock knock jokes over and over and over in some complex pattern (e.g. every third time, in French), what they produce will look like that, and nothing like thinking.

Absolutely, but this has little to do with your claim. If you narrow the data distribution, the model cannot develop appropriate language embeddings to do much of anything. You could even prove this mathematically with high probability statements.

> Failing to recognize this is going to get someone killed, if it hasn't already.

The real problem, as in the article, is that the LLM failed to intuit context or to ask a follow-up. A doctor would never have made this mistake, but only because the doctor would know the relevant context, since the patient came to see them in the first place. If you had a generic knowledgeable human acting as a resource bank who was asked the same question AND told to provide nothing irrelevant, I can see a similar response being made. To me, the bigger issue is that there are consequences to giving the general public easy access to esoteric information, and this would be reflected more in how we perform reinforcement learning to shape LLM behavior.


yeah sure, but, did it enrich the shareholders?


I'm not sure humans are any different;

Humans don't think. At all. They do next token prediction.

If they are [raised in an environment] that includes lots of examples of the result of people thinking, what they produce will look sort of like the results of people thinking, but then if they were [raised in an environment] of people repeating the same seven knock knock jokes over and over and over in some complex pattern (e.g. every third time, in French), what they produce will look like that, and nothing like thinking.

I believe this can be observed in examples of feral children and accidental social isolation in childhood. It also explains the slow start but nearly exponential growth of knowledge within the history of human civilization.


That’s…completely incorrect.

I’m not going to hash out childhood development here because I’m not paid to post, but if anyone read the above and was even slightly convinced, I implore you to go read up on even the basics of early childhood development.


> I implore you to go read up on even the basics of early childhood development.

That's kind of like taking driving lessons in order to fix an engine. 'Early childhood development' is an emergent property of what could be cumulatively called a data set (everything the child has been exposed to).


https://en.wikipedia.org/wiki/Early_childhood_development

Please read up on what the term means before claiming that it is about external influences.


No. It’s not.

ECD includes the mechanisms by which children naturally explore the world and grow.

I’m going to give you a spoiler and tell you that children are wired to explore and attempt to reason from birth.

So to fix your analogy, you reading about ECD is like you learning what an engine is before you tell a room full of people about what it does.


The neurons in a child's brain might be 'wired' to accept data sets, but that does not make them fundamentally different from AI systems.

Are you claiming that a child who is not exposed to 'reason' will reason as well as one who is? Or a child who is not exposed to 'math' will spontaneously write a proof? Or a child not exposed to English will just start speaking it?

01101100 01100101 01100001 01110010 01101110 may be baked into US and AI in different ways but it is fundamentally the same goal and our results are similarly emergent from the process.


Sure, but you can hold humans liable for their advice. Somehow I doubt this will be allowed to happen with chatbots.



