I consider the RTX 4060 Ti (the 16 GB variant) the best entry-level GPU for running small models. Its 16 GB of VRAM leaves plenty of room for large context windows, and its Tensor Cores matter a lot for inference speed. For larger models, multiple RTX 3090s are probably the way to go, since they can be bought cheaply on the second-hand market.
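To get a feel for why 16 GB is comfortable for small models with long contexts, here's a rough back-of-the-envelope VRAM estimate. This is a sketch under stated assumptions (4-bit quantized weights, fp16 KV cache, illustrative model shapes), not a measurement of any particular model or card:

```python
def vram_gb(params_b, n_layers, n_kv_heads, head_dim, ctx_len,
            weight_bits=4, kv_bytes=2, overhead_gb=1.0):
    """Rough VRAM estimate in GB: quantized weights + KV cache + fixed overhead."""
    weights = params_b * 1e9 * weight_bits / 8                       # quantized weights, bytes
    kv = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bytes   # K and V caches, bytes
    return weights / 1e9 + kv / 1e9 + overhead_gb

# Hypothetical 7B-class model with grouped-query attention
# (8 KV heads, 128-dim heads, 32 layers) at a 32k context:
print(round(vram_gb(7, 32, 8, 128, 32768), 1))  # → 8.8
```

Even with a generous 32k context, a 4-bit 7B model lands well under 16 GB, which is why the 16 GB card is comfortable where an 8 GB card would already be tight.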
I don’t have experience with AMD cards so I can’t vouch for them.