In contrast, getting through Hyperion was hard for me (some of the character stories I LOVED and some felt like a slog), but I really loved Fall of Hyperion.
> POC is fully conscious according to any test I can think of, we have full AGI
There are no tests for consciousness. Consciousness exists only as a first-person perspective and can't be inspected or detected from the outside (at least not in any way currently known to science or philosophy). What they mean when they say that is "my brain is interpreting this thing as conscious, so I am accepting that".
Maybe LLMs are conscious in some abstract way we don't understand. I doubt it, but there's no way to tell. And an AI claiming that it IS or is NOT conscious is not evidence of either conclusion.
If there is some level of consciousness, it's in a weird way that only becomes instantiated in the brief period while the model is predicting tokens, and would be highly different from human consciousness.
Yeah, a sci-fi analogy might be one where you keep getting cloned with all of your memories intact and then vaporized shortly after. Each instantiation of "you" feels a continuous existence, but it's an illusion.
(Some might argue that's basically the human experience anyway, in the Buddhist non self perspective - you're constantly changing and being reified in each moment, it's not actually continuous)
Or simply be constantly hibernated and de-hibernated. Or, if your brain is simulated, the time between the ticks.
My mental image, though, is that LLMs do have an internal state that is longer-lived than token prediction. The prompt determines it entirely, but adding tokens to the prompt only modifies it slightly, so in fact it's a continuously evolving "mental state" influenced by a feedback loop that (unfortunately) has to pass through language.
With LLMs, their internal state is their training + system prompt + context. Most chatbot UIs hide the context management. But if you take an existing conversation, replace a term in the context with another grammatically (and semantically) similar term, then send it, the LLM will adjust its output to that new "history".
It will have no conception or memory of the alternate line of discussion with the previous term. It only "knows" what is contained in the current combination of training + system prompt + context.
If you change the LLM's persona from "Sam" to "Alex", in the LLM's conception of the world it's always been "Alex". It will have no memory of ever being "Sam".
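A minimal sketch of that persona swap, using a hypothetical chat transcript shaped like common chat-API message lists (the roles and contents here are made up for illustration):

```python
# Hypothetical chat transcript; the role/content shape mimics common chat APIs.
history = [
    {"role": "system", "content": "You are Sam, a helpful assistant."},
    {"role": "user", "content": "Hi Sam, can you help me?"},
    {"role": "assistant", "content": "Of course! Sam here, happy to help."},
]

# Rewrite the persona everywhere in the context before the next request.
edited = [
    {**msg, "content": msg["content"].replace("Sam", "Alex")}
    for msg in history
]

# The model only ever sees `edited`: from its point of view it has
# always been "Alex", and no trace of "Sam" exists anywhere.
assert all("Sam" not in m["content"] for m in edited)
```

Since the edited context is the entirety of what the model is conditioned on, there is nowhere for the "Sam" history to survive.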
Yes, as I said the prompt (the entire history of the conversation, including vendor prompting that the user can't see) entirely determines the internal state according to the LLM's weights. But the fact that at each new token the prediction starts from scratch doesn't mean that the new internal state is very different from the previous one. A state that represents the general meaning of the conversation and where the sentence is going will not be influenced much by a new token appended to the end. So the internal state "persists" and transitions smoothly even if it is destroyed and recreated from scratch at each prediction.
The state "persists" as the context. There's no more than the current context. If you dumped the context to disk, zeroed out all VRAM, then reloaded the LLM, and then fed that context back in you'd have the same state as if you'd never reloaded anything.
Nothing is persisted in the LLM itself (weights, layers, etc.) nor in the hardware (modulo token caching or other scaling mechanisms). In fact this happens all the time with the big inference providers. Two sessions of a chat will rarely (if ever) execute on the same hardware.
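The dump-and-reload argument can be illustrated with a toy stand-in for inference: a pure function of (weights, context), here faked with a hash rather than a real network, so "reloading on different hardware" is just resuming from the saved context string:

```python
import hashlib

WEIGHTS = "frozen-model-v1"  # stands in for the immutable trained weights

def next_token(context: str) -> str:
    # A pure function of (weights, context): no hidden state survives
    # between calls, mimicking a stateless inference pass.
    digest = hashlib.sha256((WEIGHTS + context).encode()).hexdigest()
    return digest[:4]

def generate(context: str, n: int) -> str:
    for _ in range(n):
        context += " " + next_token(context)
    return context

# Session A: one uninterrupted run.
a = generate("hello", 6)

# Session B: "dump the context to disk", tear everything down,
# reload elsewhere, and resume from the checkpoint.
checkpoint = generate("hello", 3)
b = generate(checkpoint, 3)

assert a == b  # identical continuation: the context *is* the state
```

Because the function is deterministic in its two inputs, interrupting and resuming is indistinguishable from never stopping, which is the sense in which "the state persists as the context".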
Yes, you're repeating once again the same concept. We know it. What I am saying is that since the state encodes a horizon that goes beyond the mere generation of the next token (for the "past", it encodes the meaning of the conversation so far; for the "future", it already has an idea of what it wants to say), this state is only changing slightly at each new inference pass, despite being each time recreated from the context. So during a sequence of (completely independent) token predictions there is an internal state that stays mostly the same, evolving only gradually in a feedback loop with the tokens that are generated at each inference cycle.
Maybe it's not clear what I mean by "state". I mean a pattern of activations in the deep layers of the network that encodes some high-level semantics. Not something that is persisted. Something that doesn't need to be persisted precisely because it is fully determined by the context, and the context stays roughly the same.
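The "recreated from scratch, yet nearly unchanged" point can be sketched with a toy model of that state: a vector recomputed each pass as the mean of deterministic pseudo-embeddings (everything here is invented for illustration, not how real activations are computed):

```python
import hashlib
import math

def embed(token: str) -> list:
    # Deterministic pseudo-embedding derived from a hash (purely illustrative).
    h = hashlib.sha256(token.encode()).digest()
    return [b / 255.0 for b in h[:16]]

def state(context: list) -> list:
    # The "internal state", recomputed from scratch every pass as the
    # mean of token embeddings -- a stand-in for deep-layer activations.
    vecs = [embed(t) for t in context]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

ctx = "the meaning of this long conversation so far".split()
s1 = state(ctx)
s2 = state(ctx + ["token"])  # one new token appended, state rebuilt from zero

# Rebuilt from scratch, yet nearly identical to the previous state.
assert cosine(s1, s2) > 0.9
```

The longer the context, the smaller the influence of each appended token on the mean, which mirrors the claim that the state transitions smoothly even though nothing is carried over between predictions.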
Secondarily, I feel like it's difficult to make inferences about consciousness, though I understand why you would, given that the only reality you can access is predicated on your individual consciousness.
There are countless configurations of reality that are plausible where you're the only "conscious" being but it looks identical to how it looks now.
You're going to tell me you're Claude before we bet, right? In that case, I would bet inversely, as my experience with computers is that so far they've just been increasingly powerful calculators.
Again, I can't be absolutely sure, but fairly certain no calculators have achieved significant consciousness yet, and that's enough to make decisions.
> There are countless configurations of reality that are plausible where you're the only "conscious" being but it looks identical to how it looks now.
I can see that, but how many of those are wildly improbable? We can't abandon pragmatism if we need to make informed decisions, like granting legal rights to machines.
Basically, the reporting machinery is compromised in the same way that with the Müller-Lyer illusion you can "know" the lines are the same length but not perceive them as such.
I don't feel like Kim Dotcom is a reliable or trustworthy source. Most of what he said isn't really surprising, but this completely stretches credulity:
> Palantir is creating nuclear and bio weapon capabilities for Ukraine and is working closely with the CIA to defeat Russia.
It reads like Kim Dotcom fan fiction. Not even mentioning the gross antisemitism in the replies that he's agreeing with.
I hate Palantir and think they're evil, but this post is just silly.
I've been feeling this SO much lately, in many ways. In addition to security, just the feeling of spending decades learning to write clean code, valuing a deep understanding of my codebase and tooling, thorough testing, maintainability, etc. Now the industry is basically telling me "all that expertise is pointless, you should give it up, all we care about is a future of endless AI slop that nobody understands".
I've been feeling a similar kind of resentment often. My whole life I have prided myself on being the guy that actually bothers to read the docs and understand how shit works. Seems like the whole industry is basically saying none of that matters, no need to understand anything deeply anymore. Feels bad man.
AI slop will collapse under its own weight without oversight. I really think we will need new frameworks to support AI-generated code. Engineers with high standards will be needed to build and maintain the tools and technologies so that AI-written code can thrive. It's not game over just yet
Thanks, I've been feeling the same way. But it seems like we're some years away from the industry fully realizing it. Makes me want to quit my job and just code my own stuff.
Agreed. I find most design patterns end up as a mess eventually, at least when followed religiously. DDD being one of the big offenders. They all seem to converge on the same type of "over engineered spaghetti" that LOOKS well factored at a glance, but is incredibly hard to understand or debug in practice.