
For comparison, Groq [1] has (price per million tokens of input vs output):

    Llama 2 70B       (4096 context)   ~300 tokens/s   $0.70/$0.80
    Llama 2 7B        (2048 context)   ~750 tokens/s   $0.10/$0.10
    Mixtral 8x7B SMoE (32K context)    ~480 tokens/s   $0.27/$0.27
    Gemma 7B          (8K context)     ~820 tokens/s   $0.10/$0.10

[1] https://wow.groq.com/
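
To make those numbers concrete, here's a back-of-the-envelope cost check (my own sketch, not from the comment): prices are quoted per million tokens, split into input and output rates, so a single request costs fractions of a cent.

    def request_cost(input_tokens: int, output_tokens: int,
                     in_price: float, out_price: float) -> float:
        """Cost in dollars, given per-million-token prices."""
        return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

    # e.g. Llama 2 70B at $0.70 in / $0.80 out:
    # a 2,000-token-in / 1,000-token-out request
    print(request_cost(2_000, 1_000, 0.70, 0.80))  # ~$0.0022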


And zero capacity. Groq is coming across as a total paper tiger. No billing, unusable rate limits, and, most importantly, a request queue that makes it dramatically slower than any other option in practice.

They say they're just waiting on implementing billing, but at this point it reads more like "we wouldn't be able to meet the demand your requests would generate."

-

Groq is going through all of that to offer ~500 tokens/s in theory, while I'm seeing Fireworks.ai sustain 300+ tokens/s in production use.
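
If you want to verify that for yourself, a minimal sketch for measuring end-to-end throughput follows. It assumes the provider exposes an OpenAI-compatible streaming endpoint (both Groq and Fireworks advertise one); the base URL and model name below are placeholders, and chunk count is only a rough proxy for token count. Crucially, the clock starts before the request, so any queue wait counts against the provider.

    import time
    from openai import OpenAI

    def measure_tokens_per_second(base_url: str, api_key: str, model: str) -> None:
        client = OpenAI(base_url=base_url, api_key=api_key)
        start = time.monotonic()
        first_token_at = None
        chunks = 0  # each streamed chunk is roughly one token
        stream = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "Write 500 words about rivers."}],
            stream=True,
        )
        for chunk in stream:
            if chunk.choices and chunk.choices[0].delta.content:
                if first_token_at is None:
                    first_token_at = time.monotonic()  # includes any queue wait
                chunks += 1
        elapsed = time.monotonic() - start
        print(f"time to first token: {first_token_at - start:.2f}s")
        print(f"~{chunks / elapsed:.0f} tokens/s end to end")

Run it against each provider with the same prompt and the queue-induced latency shows up directly in the time-to-first-token figure, not just the steady-state rate.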



