Accelerators are being developed that claim to get down to 10x, though I think they will be more like 100-1000x, which would still be a huge improvement considering how people use LLMs today for basic tasks like string matching.
Are those accelerators software-only? 10x could let a $4 VPS run server-side checks for backup software (so evil clients can't clean backups) and git forges (e.g., don't allow X to push to main).