
It was interesting that Aurora used GPU Max & I'm definitely looking forward to Falcon Shores.

I think Gaudi2 was badly timed & they had to build the software stack as they went. Gaudi3 is where I think we will see mass adoption given availability, way better price/performance & a more mature stack.

There is still weird stuff when using them, but they are surprisingly solid.


We use v4s, v5es & v5ps, mostly v5ps. int8 training on them is very stable (versus the horror that is fp8 stability).
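
To illustrate the stability point (a minimal PyTorch sketch, not our actual TPU int8 path): int8 quantization-aware training keeps 255 evenly spaced levels behind a single per-tensor scale, so the rounding error is uniformly bounded, whereas fp8 formats like e4m3 carry only 3 mantissa bits and lose precision towards the edges of their dynamic range.

  import torch

  def fake_quant_int8(x: torch.Tensor) -> torch.Tensor:
      # Symmetric per-tensor int8 fake quantization with a
      # straight-through estimator, as commonly used in
      # quantization-aware training. Illustrative only.
      scale = x.abs().max().clamp(min=1e-8) / 127.0
      q = torch.clamp(torch.round(x / scale), -127, 127)
      dequant = q * scale
      # Forward sees quantized values; backward passes gradients
      # through unchanged (straight-through estimator).
      return x + (dequant - x).detach()

  w = torch.randn(1024, 1024, requires_grad=True)
  loss = fake_quant_int8(w).square().mean()
  loss.backward()  # gradients flow as if no quantization happened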


I mean, they work well; here is another blog post by Databricks: https://www.databricks.com/blog/llm-training-and-inference-i...


I think as we move to enterprise workloads, total cost of ownership becomes important.

NVIDIA is still the best for research given its ecosystem, but once models are standardised, as with transformers/LLaMA and likely multimodal diffusion transformers, it becomes about scale, availability and cost per FLOP.


It took less than a day to port our code over, and we do custom CUDA across modalities.
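
For anyone curious what such a port typically looks like on the PyTorch side, here's a minimal sketch, assuming Habana's SynapseAI stack and the habana_frameworks bridge are installed; the model and training loop are stand-ins, and custom CUDA kernels themselves still have to be rewritten against Gaudi's TPC or replaced with stock ops:

  import torch
  import habana_frameworks.torch.core as htcore  # Habana PyTorch bridge

  device = torch.device("hpu")  # was: torch.device("cuda")
  model = torch.nn.Linear(1024, 1024).to(device)  # stand-in model
  optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

  for step in range(10):
      x = torch.randn(32, 1024, device=device)
      loss = model(x).square().mean()
      loss.backward()
      # Gaudi runs PyTorch in lazy mode; mark_step() flushes and
      # executes the graph accumulated so far.
      htcore.mark_step()
      optimizer.step()
      optimizer.zero_grad()
      htcore.mark_step()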

Gaudi2 was actually announced 2 years ago and is 7nm like the A100 80GB it was meant to be competitive with. Gaudi3 later this year is probably going to be the inflection point as that ramps.

The cost is around 1/3:

https://www.intel.com/content/www/us/en/newsroom/news/vision...


"Announced" 2 years ago is different from its availability and ability:

- Intel acquired Habana in 2019

- Habana launched Gaudi2 in 2022

- only in H2 2023 did Habana enable FP8, which delivered around a 100% improvement in time-to-train

On the rest I believe you, but markets don't move based on a single individual's/company's data points.


Gaudi2s started coming out in 2022 (https://huggingface.co/blog/habana-gaudi-2-benchmark) but didn't hit mass scale. I think Gaudi3 will, & others have seen similar performance for Gaudi2, e.g. Databricks: https://www.databricks.com/blog/llm-training-and-inference-i..., MLPerf, etc.

We are about to drop Stable Diffusion 3, which is the best image model out there (https://x.com/EMostaque/status/1764941367682256950?s=20), with a similar architecture to OpenAI's Sora that can be used for any modality.

We have hundreds of millions of downloads of our models, so we are looking at big scale as we move to every pixel being generated & this stuff goes from research to mass deployment.


  2024:  Nvidia B100   TSMC 3nm (?)
  2024:  Intel Gaudi3  TSMC 5nm (*)
  2023:  AMD MI300X    TSMC 5nm/6nm
  2022:  Nvidia H100   TSMC 4N
  2020:  Nvidia A100   TSMC 7nm

(*): performance-critical chiplets, at least.


Falcon Shores next year will be crazy with 300GB VRAM & a new lithography node.


The v5es and v5ps are pretty amazing at running SD; we're handing over code for SD3 now to optimise it on those.

v5es are particularly interesting given the millions of chips that will land and the large pod sizes; they're especially well constructed for million-token context windows.
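
As a rough sketch of what running SD on TPU looks like today: SD3 weights aren't public yet, so this uses the public SD 1.5 checkpoint via diffusers' Flax pipeline as a stand-in; the model id and prompt are just examples.

  import jax
  import jax.numpy as jnp
  from flax.jax_utils import replicate
  from flax.training.common_utils import shard
  from diffusers import FlaxStableDiffusionPipeline

  # bf16 weights; generation is pmapped, one image per TPU core.
  pipe, params = FlaxStableDiffusionPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", revision="bf16", dtype=jnp.bfloat16
  )

  prompt = "a photo of an astronaut riding a horse"
  n = jax.device_count()
  prompt_ids = pipe.prepare_inputs([prompt] * n)

  params = replicate(params)                        # copy params to every core
  rng = jax.random.split(jax.random.PRNGKey(0), n)  # one RNG per core
  prompt_ids = shard(prompt_ids)                    # split batch across cores

  images = pipe(prompt_ids, params, rng, jit=True).images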


Think it'll probably crack on with Gaudi3 at 4x the performance, twice the VRAM etc. later this year.

We found CUDA-to-SYCL conversion surprisingly good: https://www.intel.com/content/www/us/en/developer/articles/t...


It's hard to guess these cards' real performance uplifts. According to Nvidia, H100 is 11x faster than A100, but that's definitely not true in most cases. If Gaudi3 is legitimately 4x faster than Gaudi2, it should be a very good value proposition compared to even the B100. I'm really curious whether Intel will be able to compete with X100 using Falcon Shores or not. Regardless, I don't think Nvidia's margins are sustainable.


"For Stable Diffusion 3, we measured the training throughput for the 2B Multimodal Diffusion Transformer (MMDiT) architecture model. Gaudi 2 trained images 1.5x faster than the H100-80GB, and 3x faster than A100-80GB GPU’s when scaled up to 32 nodes. "


I can feel the NVDA stock slipping as we speak…

It has been amazing watching the groupthink at work on that stock when we just saw the same group do it on TSLA to disastrous effect. A similar no-moat situation where they simply can’t imagine competitors ever existing.


Typically stocks fall once I buy them and go up after I sell them. I am not planning on buying NVDA for now, so likely it will keep going up.

* just to be clear - this is a joke


My father-in-law was telling me the other day that he was going to buy some NVIDIA stock because it is going to go up to 1,400.


The Model Y was the best-selling car in the world in 2023. Those of us who were buying in 2019 are still up quite a bit even though the stock was higher at one point. RIVN, Ford and GM are all losing a lot of money on every EV they sell. We were right to bet on TSLA being a major winner.

I actually put 40% of my TSLA into NVDA last year, because the demand for AI hardware is going to keep going up. I'm not saying the stock will never go down, I'm sure it will be volatile, but don't confuse short-term volatility with long-term technological transformations.


> The Model Y was the best-selling

Other manufacturers ship dozens of different models, and then you have companies like VW or Stellantis that have many different brands basically selling the same model with slightly different chassis, styling etc., so it’s hardly comparable.

And anyway, as far as valuations go, margins are as important as, or even way more important than, the total number of cars shipped. Tesla had to cut prices, and that didn’t work out that great for their stock price.

> volatility with long-term technological transformations.

Intel’s stock peaked in 2000 despite most of the related technological transformations happening in the subsequent decades, the company basically becoming a monopoly and its revenue increasing multiple times.


It's a great company & will do well; plenty of demand & B100s/GH200s etc. coming.

The Hopper stuff is particularly interesting.


They of course will likely do great; that doesn’t mean their stock price can’t be massively inflated. The current valuation is pricing in both massive growth and insane (understatement) margins, which basically means they are expected to have no actual competition for years. That’s not impossible, but surely Intel/AMD can’t be this incompetent when there are piles of money just there for the taking.


If we trained it with videos, yes, but we need more GPUs for that.


It'll be out soon; we're doing benchmark tests etc.


Thanks.

