
Could you quantify "quite slow"?


A token per second-ish with a Ryzen 7 5800X. If I run it for too long it gets slower as heat throttling kicks in; I need a better cooling system if I'm going to run it non-stop.


I've had the same experience tbh, with 7B/13B/30B on Ryzen (local) and Intel (server), both on RHEL/CentOS. It's a shame, really.


For a bit of comparison, if you've tested, how fast are 13B or 7B on the same setup?


Really fast. I didn't bother timing, but they're faster than ChatGPT by a long shot. I didn't spend very long with them because the quality is so much worse than the 65B.

I should probably go back and try again to see if it's worth it for the extra speed, now that I've played with 65B for a while.


Is this with full or empty context?


Good question--I'm counting it from an empty or near-empty context. By the time it gets to full I'm also getting heat throttling (I can tell from looking at the temp), so it's hard to know how much of the slowdown is due to one or the other.
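
For anyone who wants to put a number on "quite slow", here is a rough sketch of how tokens per second could be timed, once from an empty context and once from a full one. It assumes your runtime can hand you generated tokens as an iterator; `fake_model` is a hypothetical stand-in for that, not any real library's API.

```python
import time
from typing import Callable, Iterable, Iterator


def tokens_per_second(
    token_stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a stream of generated tokens and return average tokens/sec."""
    start = clock()
    count = 0
    for _ in token_stream:
        count += 1
    elapsed = clock() - start
    # Guard against a zero-length measurement (e.g. an empty stream).
    return count / elapsed if elapsed > 0 else float("nan")


def fake_model(n_tokens: int, delay_s: float) -> Iterator[str]:
    """Hypothetical stand-in for a real model: one token every delay_s seconds."""
    for i in range(n_tokens):
        time.sleep(delay_s)
        yield f"tok{i}"


if __name__ == "__main__":
    rate = tokens_per_second(fake_model(n_tokens=5, delay_s=0.01))
    print(f"{rate:.1f} tokens/sec")
```

Keeping each run short and letting the CPU cool between runs should help separate the context-length slowdown from the thermal one, though that is only a guess without temperatures logged alongside.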



