The current AI landscape is so cool: audio techniques like vector quantization are being used in language and image generation, while language and image generation techniques are being applied to audio tasks. It seems a unified architecture is on the horizon...