Interesting. So do I understand correctly that with such a model you have essentially no control over the output, other than that it resembles your training data, because the input is just white noise? Or is there a way to combine it with another model so that you can generate images from text inputs like 'dog with party hat'?