It's the camera-only mandate, and it's not Elon's but Karpathy's.
Any engineering student can understand why LIDAR+Radar+RGB is better than just a single camera, and any person moderately aware of tech can realize that digital cameras are nowhere near as good as the human eye.
Digital cameras are much worse than the human eye, especially when it comes to dynamic range, though I don't think that's actually all that widely known. There are also better and worse digital cameras: the ones on a Waymo are very good, the ones on a Tesla aren't that great, and that makes a huge difference.
Beyond even the cameras themselves, humans can move their heads around, use sun visors, put on sunglasses, etc., to deal with driving into the sun, but AVs don't have these capabilities yet.
You can solve this by having multiple cameras for each vantage point, with different sensors and lenses optimized for different light levels. Tesla isn't doing this, mind you, but with multiple cameras it should be easy enough to exceed the dynamic range of the human eye, so long as you auto-select whichever camera is getting the correct exposure at any given moment.
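A minimal sketch of that auto-selection step, assuming co-located cameras delivering frames normalized to [0, 1] (the function name and clipping thresholds are my own illustration, not anything any carmaker ships):

```python
import numpy as np

def best_exposed(frames, low=0.02, high=0.98):
    """Return the frame from co-located cameras that clips the least.

    `frames` is a list of 2-D float arrays in [0, 1], one per camera
    covering the same vantage point with different sensors/filters.
    """
    def clipped_fraction(img):
        # Fraction of pixels crushed to black or blown to white
        return np.mean((img <= low) | (img >= high))
    return min(frames, key=clipped_fraction)
```

In practice you'd probably blend per-region rather than pick per-frame, but frame-level selection already shows the idea.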
Tesla claims that their cameras use "photon counting" and that this lets them see well in the dark, in fog, in heavy rain, and when facing bright lights like the sun.
Photon counting is a real thing [1], but it's not what Tesla is actually describing.
I cannot tell if what they are doing is something actually effective that they should have called something other than "photon counting" or just the usual Musk exaggerations. Anyone here familiar with the relevant fields who can say which it is?
Here's what they claim, as summarized by whatever it is Google uses for its "AI Overview":
> Tesla photon counting is an advanced, raw-data approach to camera imaging for Autopilot and Full Self-Driving (FSD), where sensors detect and count individual light particles (photons) rather than processing aggregate image intensity. By removing traditional image processing filters and directly passing raw pixel data to neural networks, Tesla improves dynamic range, enabling better vision in low light and high-contrast scenarios.
It says these are the key aspects:
> Direct Data Processing: Instead of relying on image signal processors (ISPs) to create a human-friendly picture, Tesla feeds raw sensor data directly into the neural network, allowing the system to detect subtle light variations and near-IR (infrared) light.
> Improved Dynamic Range: This approach allows the system to see in the dark exceptionally well by not losing information to standard image compression or exposure adjustments.
> Increased Sensitivity: By operating at the single-photon level, the system achieves a higher signal-to-noise ratio, effectively "seeing in the dark".
> Elimination of Exposure Limitations: The technique helps mitigate issues like sun glare, allowing for better visibility in extreme lighting conditions.
> Neural Network Training: The raw, unfiltered data is used to train Tesla's neural networks, allowing for more robust, high-fidelity perception in complex, real-world driving environments.
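To make the "Direct Data Processing" bullet above concrete, here's a toy sketch of a network front end that eats linear sensor counts instead of ISP output. The layers and sizes are entirely made up for illustration; nobody outside Tesla knows what their front end actually looks like.

```python
import torch
import torch.nn as nn

class RawFrontEnd(nn.Module):
    """Consumes a raw Bayer mosaic instead of an ISP-produced RGB image."""
    def __init__(self):
        super().__init__()
        # Fold the 2x2 Bayer pattern into 4 channels (R, G, G, B sites)
        self.unshuffle = nn.PixelUnshuffle(2)
        self.conv = nn.Conv2d(4, 32, kernel_size=3, padding=1)

    def forward(self, raw):      # raw: (N, 1, H, W) linear sensor counts
        x = torch.log1p(raw)     # compress dynamic range without the lossy
                                 # tone-mapping and clipping an ISP would apply
        return torch.relu(self.conv(self.unshuffle(x)))
```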
All the sensor has to do is keep count of how many times a pixel got hit by a photon in the span of, e.g., 1/24th of a second (long exposure) and 1/10000th of a second (short exposure). Those two values per pixel yield an incredible dynamic range and can be fed straight into the neural net.
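A minimal sketch of that merge, assuming a hypothetical saturation count (the constants are illustrative, not from any real sensor):

```python
import numpy as np

T_LONG, T_SHORT = 1 / 24, 1 / 10000   # the two exposure times above
FULL_WELL = 65535                      # hypothetical saturation count

def hdr_from_counts(long_counts, short_counts):
    """Merge the two per-pixel counts into one linear photons-per-second rate."""
    rate_long = long_counts / T_LONG       # precise in the shadows
    rate_short = short_counts / T_SHORT    # survives direct sunlight
    # Wherever the long exposure saturated, trust the short one instead
    return np.where(long_counts >= FULL_WELL, rate_short, rate_long)
```

Dividing by exposure time puts both readings on the same linear scale, which is what lets two values per pixel cover both extremes at once.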
The IMX490 has a dynamic range of 140dB when spitting out actual images. The neural net could easily be trained on multi-exposure input to account for both extremely low and extremely high light. They are not trying to create SDR images.
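For scale, sensor dynamic range specs use dB = 20 * log10(ratio), so 140dB works out to a ten-million-to-one linear range:

```python
# 140 dB expressed as a linear contrast ratio
ratio = 10 ** (140 / 20)
print(f"{ratio:,.0f} : 1")   # 10,000,000 : 1
```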
Please, let's stop with the dynamic range bullshit. Point your phone at the sun next time you're blinded in your car. Or use night mode. Both see better than you do.
I have enjoyed Karpathy's educational materials over the years, but somehow missed that he was involved with Tesla to this degree. This was a very insightful comment from 9 years ago on the topic:
> What this really reflects is that Tesla has painted itself into a corner. They've shipped vehicles with a weak sensor suite that's claimed to be sufficient to support self-driving, leaving the software for later. Tesla, unlike everybody else who's serious, doesn't have a LIDAR.
> Now, it's "later", their software demos are about where Google was in 2010, and Tesla has a big problem. This is a really hard problem to do with cameras alone. Deep learning is useful, but it's not magic, and it's not strong AI. No wonder their head of automatic driving quit. Karpathy may bail in a few months, once he realizes he's joined a death march.
Using only cameras is a business decision, not a tech decision: will camera+NN be good enough before LIDAR+Radar+RGB+NN can scale up?
To me it looks like they will reach parity at about the same time, so camera-only is not totally stupid. What's stupid is forcing a robotaxi onto the road before the technology is ready.
Nah, Waymo is much safer than Tesla today, even though Tesla has way-mo* data to train on and much more compute capacity in their hands. They're in a dead end.
Camera-only was a massive mistake. They'll never admit it because there are now millions of cars out there that would be perceived as defective if they did. This is the decision that will sink Tesla, you'll see. But hail Karpathy, yeah.
"Myopia has reached near-epidemic levels worldwide, yet we still don't fully understand why," said Jose-Manuel Alonso, MD, Ph.D., SUNY Distinguished Professor and senior author of the study."
It's because you focus on nearer things much more time than things which are far away. One does not need a PhD and the whole kitchen sink to notice that.
Why does focusing on nearer things cause myopia? See, if you were curious at even a basic level, you'd realize that there are important *details* about stuff like this where it actually helps to have some real subject-matter expertise and knowledge.
I believe that during a certain age range, your eyes determine whether they've grown to the correct size based on how well they focus, and ancestral humans mostly focused on far-away things. When we spend lots of time indoors looking at screens, our eyes adapt to that as the default focus. Since eyes are also evolved to focus on things nearer than that default but not farther (the default is meant to be at infinity), this creates myopia.
That's a lovely theory, if quite imprecise in terms of the actual biology of eye development. The genuinely important part of science (the part that requires a lot of expertise and judgement) is figuring out how to make that a testable hypothesis, and then finding out whether or not it's true.
> a certain age range your eyes determine they've grown to the correct size based on how well they focus
A certain age -> which one? Why?
your eyes determine -> How? What molecular growth-signaling pathways are involved? How do they integrate with your brain's visual processing centers, and how does that relate to "how well [your eyes] focus"? Is there a biomechanical signal from muscle stress or eye curvature?
How would you test this? You'd have to change this process somehow to show that the effect is real, but you obviously can't do that with humans, so you'd probably have to use mice. But their eyes are different: how so?
Without any of this information, it's a nice "just-so" story about cavemen looking at the horizon, but not much more than that.
The only non-IP protocols I can think of are proprietary Zigbee protocols for local communication with devices, and LoRa mesh radio protocols like MeshCore.
As a European, it seems absurd to me that one would celebrate the short-term benefits of being, by a wide margin, one of the most destructive countries on earth per capita for the global climate (challenged only by a few oil states).
Is a temporary advantage worth destroying the planet forever?
Same with that "MIT" interviewer who wasn't even at MIT.
And that girl Altoff ...
Literal nobodies suddenly interviewing Elon Musk, etc... within weeks.
Things rarely go "viral" on their own these days; everything is controlled, even who gets the stage and how the message is delivered, as you have noticed.
As for who's behind it, well, we might never know. However, as arcane as it might sound, gradient descent can take you close to the answer, or at least point you towards it.
I like this recent meme of Christof from The Truman Show saying things like "now tell them that there's aliens" or crap like that.
Lex’s position at MIT would make sense for a grad student or perhaps someone early in their academic career. But Lex is neither a student nor a faculty member at MIT. So what’s he doing? This type of thing is usually unpaid or low-paying for non-faculty.
Lex got his PhD at Drexel over a decade ago. If he had pursued an academic career, he would most likely be an associate professor by now. Working as a researcher at a lab at a university that you aren’t a faculty member of is basically “failure to launch” at this stage.
But Lex is a successful podcaster. His dad is a successful academic and scientist (at Drexel). Lex is not that, but he plays one on the internet.
His paper on Tesla was widely panned as being not academically rigorous and more of an advertisement.
The rest of his papers are at least six years old.
So what is he doing as a research scientist? Don’t get me wrong: I like his podcast. I think he gets good guests. But he’s not doing any level of research.
Whatever you do please DO NOT look up these links on the Internet Archive.
Not just that, but I would also suggest you stop using the Internet Archive in general, as it is obviously not a reliable source of truth like Wikipedia or the many news outlets with specialized people who spend a non-trivial amount of their time carefully checking all of this information.
A lot of people believe that Fridman is not affiliated with MIT even though the university says he is. <https://lex.mit.edu/> It's a recurring thing on the Talk page for the Wikipedia article.
Nah, that's just Reddit. At this point it's safer to take anything that's popular on Reddit as either outright wrong or so heavily out of context that it's not relevant.
Oh, sure, I learned a long time ago that Reddit is a very reliable anti-indicator. But given that HN isn't nearly as bad (though it has its moments), it's still strange that people would repeat something about someone else that they could disprove for themselves in 30 seconds.
If the point of learning a language were to say new words to yourself, you could just make up words.
If you want to be understood and to understand others, whoever "they" are sort of need to exist while you're learning.
I can promise you, speaking out of a phrase book burned into your brain with limited cultural knowledge from other people makes for a very boring, cringeworthy conversation partner and an awful language teacher.
> Any engineering student can understand why LIDAR+Radar+RGB is better than just a single camera, and any person moderately aware of tech can realize that digital cameras are nowhere near as good as the human eye.
But yeah, he's a genius or something.