Has there ever been a time, going back to the 70s with original PCs, where new software didn't necessitate a new computer?
Things are also getting better now that Intel is dying. I mean, the new Apple silicon chips are astoundingly fast and energy efficient; an M1 from 5 years ago is still going strong and probably won't truly need replacing for another 2. Similar for Ryzen chips from 5 years ago!
Things have changed a lot in 20 years. In 2005 we didn't consume all of our video / audio media online. We didn't have social media, just blogs and RSS readers. YouTube had just been released. TikTok, Facebook and Twitter didn't exist. Hypermedia today is very rich and necessitates a lot of resources. But at the same time, most work the past 10 years has been on native apps (on mobile particularly but also PCs), not web sites. Most people don't use the web browser as much.
I think the main hesitancy is due to rampant anthropomorphism. These models cannot reason, they pattern match language tokens and generate emergent behaviour as a result.
Certainly the emergent behaviour is exciting but we tend to jump to conclusions as to what it implies.
This means we are far more trusting with software that lacks formal guarantees than we should be. We are used to software being sound by default but otherwise a moron that requires very precise inputs and parameters and testing to act correctly. System 2 thinking.
Now with NN it's inverted: it's a brilliant know-it-all but it bullshits a lot, and falls apart in ways we may gloss over, even with enormous resources spent on training. It's effectively incredible progress on System 1 thinking with questionable but evolving System 2 skills where we don't know the limits.
If you're not familiar with System 1 / System 2, it's googlable.
Not trying to be a smarty pants here, but what do we mean by "reason"?
Just to make the point, I'm using Claude to help me code right now. In between prompts, I read HN.
It does things for me such as coding up new features, looking at the compile and runtime responses, and then correcting the code. All while I sit here and write with you on HN.
It gives me feedback like "lock free message passing is going to work better here" and then replaces the locks with the exact kind of thing I actually want. If it runs into a problem, it does what I would have done a few weeks ago: it sees that some flag is set wrong, or that some architectural decision needs to be changed, and then implements the changes.
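To give a flavour of what that refactor looks like, here's a minimal Go sketch, not my actual code: the event counter and names are made up, and Go channels aren't strictly lock-free internally, this just shows the locks-to-message-passing shape.

```go
package main

import "fmt"

// Before: a shared counter guarded by a mutex (the usual lock-based shape).
// After: the same work expressed as message passing over a channel, so no
// explicit lock appears in user code.

type event struct{ delta int }

func main() {
	events := make(chan event, 64) // buffered queue of messages
	done := make(chan int)

	// Single owner goroutine: the only place that touches `total`,
	// so no mutex is needed.
	go func() {
		total := 0
		for e := range events {
			total += e.delta
		}
		done <- total
	}()

	// Producers just send messages instead of taking a lock.
	for i := 0; i < 10; i++ {
		events <- event{delta: i}
	}
	close(events)

	fmt.Println("total:", <-done)
}
```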
What is not reasoning about this? Last year at this time, if I looked at my code with a two hour delta, and someone had pushed edits that were able to compile, with real improvements, I would not have any doubt that there was a reasoning, intelligent person who had spent years learning how this worked.
Is it pattern matching? Of course. But why is that not reasoning? Is there some sort of emergent behavior? Also yes. But what is not reasoning about that?
I'm having actual coding conversations that I used to only have with senior devs, right now, while browsing HN, and code that does what I asked is being produced.
I think the biggest hint that the models aren't reasoning is that they can't explain their reasoning. Researchers have shown, for example, that how a model solves a simple math problem and how it claims to have solved it after the fact have no real correlation. In other words there was only the appearance of reasoning.
People can't explain their reasoning either. People do a parallel construction of logical arguments for a conclusion they already reached intuitively, in a way they have no clue how it happened: "The idea just popped into my head while showering." To our credit, if this post-hoc rationalization fails we are able to change our opinion to some degree.
Interestingly, people have to be trained in logic and in identifying fallacies because logic is not a native capability of our mind. We aren’t even that good at it once trained, and many humans (don’t forget an IQ of 100 is the median) cannot be trained.
Reasoning appears to actually be more accurately described as “awareness,” or some process that exists alongside thought where agency and subconscious processes occur. It’s by construction unobservable by our conscious mind, which is why we have so much trouble explaining it. It’s not intuition - it’s awareness.
Yeah, surprisingly I think the differences are less in the mechanism used for thought and more in the experience of being a person alive in a body. A person can become an idea. An LLM always forgets everything. It cannot "care."
> I'm having actual coding conversations that I used to only have with senior devs, right now, while browsing HN, and code that does what I asked is being produced.
I’m using Opus 4 for coding and there is no way that model demonstrates any reasoning or any “intelligence” in my opinion. I’ve been through the having-conversations phase etc but it doesn’t get you very far; better to read a book.
I use these models to help me type less now, that’s it. My prompts basically tell it to not do anything fancy and that works well.
You raise a fair point. These criticisms based on "it's merely X" or "it's not really Y" don't hold water when X and Y are poorly defined.
The only thing that should matter is the results they get. And I have a hard time understanding why the thing that is supposed to behave in an intelligent way but often just spews nonsense gets 10x budget increases over and over again.
This is bad software. It does not do the thing it promises to do. Software that sometimes works and very often produces wrong or nonsensical output is garbage software. Sinking 10x, 100x, 1000x more resources into it is irrational.
Nothing else matters. Maybe it reasons, maybe it's intelligent. If it produces garbled nonsense often, giving the teams behind it 10x the compute is insane.
"Software that sometimes works and very often produces wrong or nonsensical output" can be extremely valuable when coupled with a way to test whether the result is correct.
> It does not do the thing it promises to do. Software that sometimes works and very often produces wrong or nonsensical output...
Is that very unlike humans?
You seem to be comparing LLMs to much less sophisticated deterministic programs. And claiming LLMs are garbage because they are stochastic.
Which entirely misses the point because I don't want an LLM to render a spreadsheet for me in a fully reproducible fashion.
No, I expect an LLM to understand my intent, reason about it, wield those smaller deterministic tools on my behalf and sometimes even be creative when coming up with a solution, and if that doesn't work, dream up some other method and try again.
If _that_ is the goal, then some amount of randomness in the output is not a bug, it's a necessary feature!
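To make "randomness as a feature" concrete, here's a toy Go sketch of temperature-scaled sampling, roughly the knob LLM APIs expose; the scores are made-up numbers, not from any real model. Near-zero temperature makes the pick nearly deterministic, higher temperature deliberately admits variety.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// sample draws one index from next-token scores after a temperature-scaled
// softmax. Temperature near 0 makes the choice nearly deterministic;
// higher values deliberately admit more randomness.
func sample(rng *rand.Rand, scores []float64, temperature float64) int {
	probs := make([]float64, len(scores))
	sum := 0.0
	for i, s := range scores {
		probs[i] = math.Exp(s / temperature)
		sum += probs[i]
	}
	r := rng.Float64() * sum
	for i, p := range probs {
		r -= p
		if r <= 0 {
			return i
		}
	}
	return len(probs) - 1
}

func main() {
	rng := rand.New(rand.NewSource(42))
	scores := []float64{2.0, 1.5, 0.3} // toy "logits" for three candidate tokens
	for _, t := range []float64{0.1, 1.0} {
		counts := make([]int, len(scores))
		for i := 0; i < 1000; i++ {
			counts[sample(rng, scores, t)]++
		}
		fmt.Printf("temperature %.1f -> picks per token: %v\n", t, counts)
	}
}
```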
You're right, they should never have given more resources and compute to the OpenAI team after the disaster called GPT-2, which only knew how to spew nonsense.
We already have highly advanced deterministic software. The value lies in the abductive “reasoning” and natural language processing.
We deal with non-determinism any time our code interacts with the natural world. We build guard rails, detection, classification of false/true positives and negatives, and all that all the time. This isn’t a flaw, it’s just the way things are for certain classes of problems and solutions.
It’s not bad software - it’s software that does things we’ve been trying to do for nearly a hundred years, beyond any reasonable expectation. The fact I can tell a machine in human language to do some relatively abstract and complex task and it pretty reliably “understands” me and my intent, “understands” its tools and capabilities, and “reasons” how to bridge my words to a real-world action is not bad software. It’s science fiction.
The fact “reliably” shows up is the non-determinism. It’s not perfect, although on a retry with a new seed it often succeeds. This feels like most software that interacts with natural processes in any way or form.
It’s remarkable that anyone who has ever implemented exponential backoff and retry, or has ever handled edge cases, can sit and say “nothing else matters,” when they make their living dealing with non-determinism. Because the algorithmic kernel of logic is 1% of programming and systems engineering, and 99% is coping with the non-determinism in computing systems.
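To make that concrete, here's a minimal Go sketch of that guard rail - exponential backoff with jitter - where callFlaky is a hypothetical stand-in for any non-deterministic dependency (a network call, an LLM request, a device driver), not a real API:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// callFlaky stands in for any non-deterministic dependency.
// Hypothetical placeholder: here it simply fails the first few attempts.
func callFlaky(attempt int) error {
	if attempt < 3 {
		return errors.New("transient failure")
	}
	return nil
}

// retryWithBackoff is the classic guard rail: exponential backoff plus
// jitter, capped at a maximum number of attempts.
func retryWithBackoff(maxAttempts int) error {
	base := 100 * time.Millisecond
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err := callFlaky(attempt); err == nil {
			return nil
		}
		// delay = base * 2^attempt, plus random jitter to avoid thundering herds
		delay := base * time.Duration(1<<attempt)
		jitter := time.Duration(rand.Int63n(int64(base)))
		time.Sleep(delay + jitter)
	}
	return fmt.Errorf("still failing after %d attempts", maxAttempts)
}

func main() {
	if err := retryWithBackoff(5); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("succeeded")
}
```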
The technology is immature and the toolchains are almost farcically basic - money is pouring into model training because we have not yet hit a wall with brute force. And it takes longer to build a new way of programming and designing highly reliable systems in the face of non-determinism, but it’s getting better faster than almost any technology change in my 35 years in the industry.
Your statement that it “very often produces wrong or nonsensical output” also tells me you’re holding onto a bias from prior experiences. The rate of improvement is astonishing. In my professional use of frontier LLMs and techniques, they are exceeding the precision and recall of humans, and there’s a lot of rich ground untouched. At this point we can largely offload massive amounts of work that humans would do in decision making (classification) and use humans as a last line to exercise executive judgement, often with the assistance of LLMs. I expect within two years humans will only be needed in the most exceptional of situations, and we will do a better job on more tasks than we ever could have dreamed of with humans. For the company I’m at this is a huge bottom-line improvement far and beyond the cost of our AI infrastructure and development, and we do quite a lot of that too.
If you’re not seeing it yet, I wouldn’t use that to extrapolate to the world at large and especially not to the future.
>I think the main hesitancy is due to rampant anthropomorphism. These models cannot reason, they pattern match language tokens and generate emergent behaviour as a result
This is rampant human chauvinism. There's absolutely no empirical basis for the statement that these models "cannot reason", it's just pseudoscientific woo thrown around by people who want to feel that humans are somehow special. By pretty much every empirical measure of "reasoning" or intelligence we have, SOTA LLMs are better at it than the average human.
There's nothing accelerationist about recognising that making unfalsifiable statements about LLMs lacking intelligence or reasoning ability serves zero purpose except stroking the speaker's ego. Such people are never willing to give clear criteria for what would constitute proof of machine reasoning for them, which shows their belief isn't based on science or reason.
I guess your work doesn't involve any maths then, because otherwise you'd see they're capable of solving maths problems that require a non-trivial number of reasoning steps.
Just the other day I needed to code some interlocked indices. It wasn't particularly hard but I didn't want to context switch and think, so instead I asked GPT-4o. After going back and forth 4 or 5 times, where it gave wrong answers, I finally decided to just take a pen and paper and do it by hand. I have a hard time believing that these models are reasoning, because if they are, they are very poor at it.
Anybody I’ve ever demoed this product to has the opposite impression: it would replace their iPad and their TV easily if they could afford it and it was a bit lighter. I play Xbox and PS5 in it all the time; it isn’t a rival to those products. It’s a preview of the future of all computing.
Siri on visionOS is actually quite refreshing for its speech to text, it is almost entirely on device and very low latency, with much more accuracy than you see on iOS. The combination of a keyboard, trackpad, Siri, hand and eye tracking is incredibly high information rate input for me and it’s what I use for most of the day for the past eight months or so.
Their priority is not selling a mass market low margin product. That is Meta’s strategy, and they’ve lost nearly $100b because they think it’s the future of all computing. The thing is, Apple agrees. But they’re not the kind of company to burn that kind of capital.
Vision Pro was all about selling an enthusiast device that pushes the boundaries of XR technology to what they thought was an appropriate baseline that would shift the market. They succeeded at that: the entire market is changing its strategy to respond to visionOS. visionOS has set the baseline for spatial computing so much that even Horizon OS is copying it now. Apple takes the product line very seriously; they’re just playing a different game than you want them to play.
I mean, I don't care what they do (other than as a shareholder, lol. But it's not a major part of my portfolio.)
I just do not think they have made an impact on the mass-market -- and at their market cap, anything short of mass-market should be considered a failure and a distraction from products that actually sell.
"Horizon OS copying them" is flattering I guess, but they're not copying all the stupid things about AVP: The heavy, expensive metal construction, the silly outer display, the stupid tethered proprietary battery, the $3500 price tag. I do have a Quest 3 though, which is vaguely fun, but was an impulse buy I only occasionally use.
I've never even tried the AVP, and while that seems like a disqualification of me as a judge of it, that's just the point: I'm a geek. If even I dismiss it as a useless and overpriced toy, it will never be mass market, because normal people need more of a justification than I do to adopt a gadget. It needs to do something amazing that people immediately see the value of. Which is why I cited courtside NBA games (not a 10-minute short btw) as an example.
If Apple's 'game' is to make a niche device with no important apps and about 5,000 MAUs then they're playing it great.
Book a demo at the Apple Store. It’s the kind of product that seems like an overpriced toy until you actually experience it.
People who follow the XR industry know that most aspects of the AVP were very carefully considered engineering and design trade-offs, including the aluminum construction, which is arguably lighter than plastic for the nature of the headset design requiring a certain level of durability and recyclability. The tethered battery is also a very smart design decision that I think we will see followed by other manufacturers. The outward-facing display is necessary if the headset is to be integrated in the workplace or in a social environment, such as cafés or airplanes. In my experience, my family and coworkers appreciate it.
The battery being connected with a proprietary plug is a very smart decision? They couldn’t have used USB-C? Is this like Lightning? Because they admitted after 8 years of fighting it that USB-C made a lot more sense there too.
For me that part is just proof that they are determined, even after you spend $3.5k, to nickel and dime you: the only way to get increased battery life is either to buy additional hundred-plus-dollar batteries from Apple (so, marked up 10x from their cost), or to daisy-chain the heavy battery to your own heavy battery (or to the darn wall.)
They did use USB-C, on the battery itself. You literally can use the headset, and most do, with it plugged in. The exception is when you’re doing room scale, immersive VR, or when you’re walking around the house or office, but that’s generally within the life of the battery. The connector on the headset itself is flat with a lock, so that the cord runs towards the back of your head and doesn’t disconnect by mistake. Similar to MagSafe, it is well designed. Last I checked, daisy-chaining batteries is exactly how the Meta Quest does it with the Elite Strap, and also how all iPhones and iPads work, so I’m not sure what the problem is.
You’re covering this board with statements of rumor, personal opinion and conjecture as if they were statements of fact, and getting the rumors wrong consistently. The Vision Pro leader, Mike Rockwell, runs both visionOS and Siri/AI. Many of his lieutenants are working on Siri, while many are staying on visionOS. The architectural approach and UX of visionOS is infecting the entire software group.
Goggles have not bombed, and are not going away for at least a decade. Apple will not be abandoning visionOS for spectacles, I know you keep repeating this, but it’s false.
And yet I have people constantly telling me how they have four-monitor setups and still not enough screen real estate… and this is why they don’t like the Vision Pro, which can only give you one big ultrawide at most.
I've yet to meet someone with strong preference for screen real estate that could back it up with productivity. Sometimes people just want stimulation.
Edit: i have no gripe with these people, I just simply don't buy that they're more productive. We all need our comforts. Mine is music.
It does a lot more than iPad apps. Coexisting 3D volumetric apps with 2D apps is quite innovative, and not even Android XR will be able to do that in its first release. What software development tools do you think it’s missing?
They had about 450K units in the wild in the first year. They will sell around 600K units of the first version before the second one is released later this year. This was against a potential production target of 900K units. So it’s not exactly a failure so much as an underperformance due to them overpricing it for supply constraints.
VisionOS is clearly a triumph and the basis of UX for their entire product line. The scepticism is gonna age like the scepticism about the original Mac and Windows. Vision Pro isn’t even all that expensive by historical standards.
It didn’t bomb in the market. It was priced for supply constraints. They basically tied Meta on all headset revenue in its first year (450k units and $1.4 billion). visionOS is a triumph and foundation for the next decade of computing, and the entire product line of Apple is adapting to its look and feel. The entire XR / VR industry is changing their strategy to respond to visionOS, particularly Meta’s Horizon OS which has incorporated numerous improvements that were directly inspired by Apple. And plenty of Apple products sold a lot less than 10 million annually to this day.