
Wow nice!! That's a really good deal for that much hardware.

How many tokens/s do you get for DeepSeek-R1?



Thanks, it was a bit of a gamble at the time (lots of dodgy eBay parts), but it paid off.

R1 starts at about 10 t/s on an empty context but quickly falls off as the context fills. I'd say the majority of my tokens are generated at around 6 t/s.

Some of the other big MoE models can be quite a bit faster.

I'm mostly using QwenCoder 480b at Q8 these days, at about 9 t/s on average. I've found I get better real-world results out of it than K2, R1 or GLM4.5.



