
How is this even possible?



In case I'm missing something, why wouldn't it be possible?

Claude and Gemini have similar offerings for a similar or identical price, I thought. E.g., if Claude Code can do it for $200/month, why can't Cerebras?

(honest question, trying to understand the challenge for Cerebras that you're pointing to)

edit: Maybe it's the speed? 2k tokens/s sounds... fast, much faster than Claude. Is that what you're referring to?


He was just making an exclamation, like "wow, incredible!".


They make frisbee-sized CPUs.


Indeed. Pretty much all silicon today is fabricated on 12" (300 mm) or so wafers, which are cut into chip-sized dies; each die is tested, and the ones that fail are thrown away.
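
To put rough numbers on why wafers are normally cut into small pieces, here is a minimal sketch using the textbook Poisson yield model, yield = exp(-D * A). The defect density and die areas are made-up illustrative values, not figures from any fab or from Cerebras, and the function name is my own.

```python
import math

def poisson_yield(defect_density_per_cm2: float, die_area_cm2: float) -> float:
    """Fraction of dies expected to have zero defects under a simple Poisson model."""
    return math.exp(-defect_density_per_cm2 * die_area_cm2)

# Hypothetical numbers, for illustration only.
DEFECTS_PER_CM2 = 0.1

small_die = poisson_yield(DEFECTS_PER_CM2, 1.0)    # a 1 cm^2 chip
huge_die = poisson_yield(DEFECTS_PER_CM2, 400.0)   # a die hundreds of cm^2 in area

print(f"small die yield: {small_die:.3f}")
print(f"huge die yield (no redundancy): {huge_die:.2e}")
```

Under this model a very large die with no redundancy almost never comes out defect-free, which is why a wafer-sized chip has to tolerate defects rather than avoid them.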

Cerebras uses the entire 12" wafer and builds in redundancy so that, at current defect rates, a large fraction of the wafers are usable. This allows a huge level of parallelism and a large amount of on-wafer RAM, and it removes the need to move data on and off the wafer. So the available bandwidth is insane, and inference is mostly bandwidth-limited.
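
A back-of-envelope sketch of what "bandwidth limited" means: if generating each token requires reading every weight once, then single-stream tokens/s is capped at roughly memory bandwidth divided by model size in bytes. The model size and bandwidth figures below are hypothetical placeholders, not specs for Cerebras or any other hardware, and the estimate ignores batching, KV-cache traffic, and compute time.

```python
def max_tokens_per_second(param_count: float,
                          bytes_per_param: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode speed when every weight is read once per token."""
    bytes_per_token = param_count * bytes_per_param
    return bandwidth_bytes_per_s / bytes_per_token

# Hypothetical example: a 70e9-parameter model stored at 2 bytes per parameter.
PARAMS = 70e9
BYTES_PER_PARAM = 2.0

# Two made-up memory bandwidths, 100x apart, to show how the bound scales.
slow = max_tokens_per_second(PARAMS, BYTES_PER_PARAM, 1e12)
fast = max_tokens_per_second(PARAMS, BYTES_PER_PARAM, 1e14)

print(f"bound at 1e12 B/s: {slow:.1f} tokens/s")
print(f"bound at 1e14 B/s: {fast:.1f} tokens/s")
```

The bound scales linearly with bandwidth, so keeping the weights in memory on the wafer, instead of streaming them from off-chip memory, raises the ceiling directly.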



