To focus just on the "better in every regard" part of the comment:
One area where Cerebras systems perform well is training models that require long sequence lengths or high-resolution images.
A concrete example is this publication, https://www.biorxiv.org/content/10.1101/2022.10.10.511571v2, where researchers built genome-scale language models that learn the evolutionary landscape of SARS-CoV-2 genomes. In the paper, the researchers note: "We note that for the larger model sizes (2.5B and 25B), training on the 10,240 length SARS-CoV-2 data was infeasible on GPU clusters due to out-of-memory errors during attention computation."
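To see why attention at that sequence length overwhelms GPU memory, here's a rough back-of-envelope sketch. The head count, batch size, and dtype below are hypothetical illustrative values, not numbers from the paper:

```python
# Back-of-envelope memory estimate for the naive attention score matrix.
seq_len = 10_240          # sequence length cited in the paper
num_heads = 32            # hypothetical head count for a multi-billion-param model
batch_size = 8            # hypothetical per-device batch size
bytes_per_elem = 2        # fp16/bf16

# Naive attention materializes a (seq_len x seq_len) score matrix per head,
# per sequence, so this term grows quadratically with sequence length.
scores_bytes = batch_size * num_heads * seq_len * seq_len * bytes_per_elem
print(f"Attention scores alone: {scores_bytes / 2**30:.1f} GiB")  # ~50.0 GiB
```

That ~50 GiB is just the score matrices, before weights, other activations, gradients, and optimizer state. The quadratic term is the core issue: doubling the sequence length quadruples it, which is consistent with the out-of-memory errors the authors hit on GPU clusters at 10,240 tokens.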