I wouldn't run local models on the development PC. Instead, run them on a box in another room or another location: less fan noise, and it won't affect the performance of the PC you're working on.
Network latency is not really an issue for LLMs; even a few hundred ms of round-trip time won't matter when generating a response takes seconds.
Running them on the dev machine doesn't make a lot of sense to me, except when working offline while traveling.
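
To put rough numbers on the latency point, here is a back-of-the-envelope sketch. The function name and all figures (300 ms round trip, a 200-token reply, 20 tokens/s) are made up for illustration, not measurements; plug in your own.

```python
def network_share(rtt_ms: float, tokens: int, tokens_per_s: float) -> float:
    """Fraction of the total wait spent on one network round trip."""
    generation_ms = tokens / tokens_per_s * 1000.0  # time the model spends generating
    return rtt_ms / (rtt_ms + generation_ms)

# Hypothetical numbers: 300 ms round trip, 200-token reply at 20 tokens/s.
share = network_share(rtt_ms=300, tokens=200, tokens_per_s=20)
print(f"{share:.1%} of the wait is network latency")  # prints "2.9% of the wait is network latency"
```

This only counts a single round trip per request. With streaming, the round trip mostly shows up as a slightly later first token rather than a longer total wait.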
I'm not fully convinced that those devices don't create noise at full power. But one issue still remains: LLMs eating up compute on the device you're working on. This will always be noticeable.