Full disclosure: we have a contract with AMD to get Llama 405B training on MI350X on MLPerf.
Things are turning around for AMD. If you have an AMD card, go to pytorch.org, click Linux+ROCm and install PyTorch. 3 years ago, this was hopeless. Today, most mainline things work. I ran nanochat on MI300X and it just worked. I think that's true about MI350X now too. The MI350X machine is stable.
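If you want to sanity-check the install, something like the following works; the index URL below is just what the pytorch.org selector currently produces for one ROCm version, so treat it as a placeholder:

    # install command from the pytorch.org selector (Linux + ROCm);
    # the rocm6.2 suffix changes per release:
    #   pip install torch --index-url https://download.pytorch.org/whl/rocm6.2
    import torch

    # ROCm builds of PyTorch reuse the torch.cuda API, so this works on AMD cards
    print(torch.cuda.is_available())      # True if the GPU is visible
    print(torch.cuda.get_device_name(0))  # e.g. "AMD Instinct MI300X"
    print(torch.version.hip)              # HIP version string on ROCm builds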
They are clearly behind NVIDIA, nobody doubts that. And a lot of investment in software will be required to catch up: ecosystem, compiler, and driver. But 2 years ago they seemed hopeless; now they don't. Things take time. HipKittens is a great codebase to study to see where AMD's LLVM backend is still lacking; compare it to the CUDA Kittens.
For training, it's NVIDIA and Google in first. AMD in second. And nobody in third. Intel and Tenstorrent are not remotely close. Huawei examples segfaulted. Groq gave up selling chips. Cerebras isn't available anywhere. Trainium had a 5-day wait to get one instance and I lost interest.
Speaking as the CEO of an AMD NeoCloud for the past 2 years, it is so nice to hear all this and to see the turnaround. It is what I bet my business on from the start, and I can concur with what George is saying 100%.
The out-of-the-box experience can be a bit rough around the edges on bleeding-edge stuff, but it isn't anywhere near as bad as it used to be. For example, a month ago nanochat wasn't working well, and now it is. The important thing is that people now care enough to make it work.
At the end of the day, AI does need viable options. Having a monopoly on all AI hardware and software might be a good thing for shareholders, but it isn't a good thing for what is looking like a fundamental technology, akin to the internet.
That’s interesting. I was specifically looking for AMD hardware offered by neoclouds; they seem to be rare.
I like your bet though. For decades there has been no real difference between NVDA and AMD at the hardware level. AMD has always been on par, and software is software; it will catch up.
AMD will be a stock many people miss, because the opportunity has presented itself at the height of AI bubble talk, and that will leave many in the dust. A doubling or tripling of their market cap is pretty much a foregone conclusion.
You're right, it is a much smaller ecosystem, but I think that is partly intentional as a way to focus efforts and not feed into the bubble, which I feel is a smart move. These are the official partners [0]. I'm Hot Aisle.
George was very smart: $500k in the $90s. I saw it coming even earlier than he did, but that's because I was already aware the hardware was good from my own experience.
Will it catch up or will it forever chase nvidia's tail? I'm betting on the latter unless another AI winter happens. And contrary to anti-generative AI social media talking points, the literature suggests The Red Queen's race is continuing apace IMO.
Nvidia remains undefeated at responding to hardware threats with hardware diving catches to this day. What scenario prevents them from yet another one of their diving catches? I'm genuinely curious as to how one could pull that off. It's like challenging Google in search: even if you deliver better product and some have, the next thing you know Google is doing the same thing or better with deeper pockets.
> Nvidia remains undefeated at responding to hardware threats with hardware diving catches to this day. What scenario prevents them from yet another one of their diving catches?
The fact that they have made roughly the same hardware as AMD for the last two decades, and even today. There was no diving catch; AMD just ignored what its hardware was capable of and didn't back OpenCL. For example, just in this thread alone, AMD paid someone to make this shit work on their hardware. Don't bet against what's coming.
Except no, AMD 100% played follow the leader with technology like CUDA, NVLink, and tensor cores.
Even paying someone in academia to get s** to work on their hardware is yet another example of follow the leader.
What exactly do you think is coming? I think the biggest threat is one or more Chinese companies catching up on both hardware and ecosystem in the next half decade or so myself, mostly because of the state level support for making that so. But I absolutely don't expect an x86_64 moment for GPUs here given past results and the current bias against software in AMD's HW culture. Convince me otherwise.
1 and 2 are supported: 1 you need to specify, 2 will be found by BEAM. We are working on reimplementing HipKittens in tinygrad; all the pieces are there to do it. See the amd_uop_matmul example.
tinygrad doesn't support 3 yet; it's not needed on any AMD GPUs, and not needed on NVIDIA consumer cards. It wouldn't be hard to add, but it's important to figure out how it best fits with the existing abstractions. I think everything will eventually move to a more producer-consumer model.
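For context, BEAM is tinygrad's built-in kernel search: set the BEAM environment variable and it searches over kernel optimizations instead of relying on hand-tuned heuristics. A minimal sketch; the matmul is just illustrative:

    # run as: BEAM=2 python matmul.py  (BEAM is read from the environment)
    from tinygrad import Tensor

    a = Tensor.rand(4096, 4096)
    b = Tensor.rand(4096, 4096)
    # with BEAM set, tinygrad beam-searches a schedule for this kernel
    # and caches the winner for later runs
    (a @ b).realize()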
Right now, AI support on AMD is official only for specific GPU models, but they are working hard to turn this around and broaden support. And they're making progress.
Vulkan compute is also getting some good press as a local LLM platform (at least on the Linux side); it will be interesting to see which crosses the line to "can ship production-quality apps on this" first.
Nope! It works fine with a somewhat recent in-tree kernel. The AMD driver is actually open source, not just a wrapper around a big on-device blob like the NVIDIA one. tinygrad also has a driver that doesn't even need the kernel module; it just mmaps the PCIe BAR into Python.
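The userspace-driver trick is mostly standard Linux plumbing: sysfs exposes each PCIe BAR as a file you can mmap. A minimal sketch with a made-up device address; tinygrad's real driver obviously builds a lot more on top of this:

    import mmap
    import os

    # hypothetical PCI address; find your GPU's with `lspci -d 1002:` (AMD's vendor ID)
    BDF = "0000:03:00.0"

    # resource0 is BAR0; mapping it requires root
    fd = os.open(f"/sys/bus/pci/devices/{BDF}/resource0", os.O_RDWR | os.O_SYNC)
    bar = mmap.mmap(fd, 0)  # length 0 maps the whole BAR

    # reads and writes through this mapping hit the device directly
    print(bar[:4].hex())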