
I consider the RTX 4060 Ti the best entry-level GPU for running small models. The 16 GB variant gives you plenty of VRAM for large context windows, and its Tensor Cores are crucial for fast inference. For larger models, probably multiple RTX 3090s, since you can buy them cheap on the second-hand market.

I don’t have experience with AMD cards so I can’t vouch for them.
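As a rough sketch of why 16 GB matters (the numbers are back-of-envelope; the 32-layer / 4096-dim shape is my assumption for a generic 7B model, not an exact spec):

    # Back-of-envelope VRAM estimate for transformer inference.
    # Model shape (32 layers, d_model=4096 for "7B") is an assumption
    # for illustration, not an exact spec of any particular model.

    def vram_estimate_gb(n_params_b, bytes_per_weight, n_layers, d_model,
                         context_len, batch=1, overhead_gb=1.0):
        weights = n_params_b * 1e9 * bytes_per_weight
        # KV cache: 2 tensors (K and V) per layer, fp16 (2 bytes) per value
        kv_cache = 2 * n_layers * context_len * d_model * 2 * batch
        return (weights + kv_cache) / 1e9 + overhead_gb

    # 7B model, 4-bit weights, 8K context: ~8.8 GB -> fits in 16 GB
    print(vram_estimate_gb(7, 0.5, 32, 4096, context_len=8192))
    # Same model with fp16 weights: ~19.3 GB -> does not fit
    print(vram_estimate_gb(7, 2.0, 32, 4096, context_len=8192))

So a 4-bit 7B model with an 8K context fits comfortably in 16 GB, while fp16 weights already push past it.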



I know nothing about GPUs. Should I assume that when people say "RAM" in the context of GPUs they always mean VRAM?


Not always, since system RAM also needs to be adequate, but mostly yes: it usually refers to the total VRAM across the GPU(s).


“GPU with xx RAM” means VRAM, yes.
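If you want to check what a card actually reports, a minimal PyTorch sketch (assumes a CUDA build of PyTorch is installed):

    import torch

    # Print the name and total VRAM of each visible CUDA device.
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, "
              f"{props.total_memory / 1024**3:.1f} GiB VRAM")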



