
Interesting. So am I understanding this right: if you use such a model, you essentially have no control over the output other than that it's similar to your training data, because your input is just white noise? Or is there a way to combine this with another model so you could generate images from inputs like 'dog with party hat'?


In this formulation, no, you have no control over the output other than the fact that it is similar to your training data.

If you need control over the generated image, you would use a conditional diffusion model: https://arxiv.org/pdf/2111.13606.pdf
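
To give a rough picture of what "conditional" means in practice, here is a minimal PyTorch-style sketch (not the paper's implementation) of sampling with classifier-free guidance: the denoiser takes an extra embedding of the prompt, and at each step the conditional and unconditional noise predictions are mixed. TinyDenoiser, the embedding sizes, and the guidance scale are all made up for illustration; a real system would use a U-Net and a text encoder.

  import torch
  import torch.nn as nn

  class TinyDenoiser(nn.Module):
      """Hypothetical noise predictor eps_theta(x_t, t, cond); stands in for a U-Net."""
      def __init__(self, img_dim=32 * 32, cond_dim=64):
          super().__init__()
          self.net = nn.Sequential(
              nn.Linear(img_dim + 1 + cond_dim, 256),
              nn.ReLU(),
              nn.Linear(256, img_dim),
          )

      def forward(self, x_t, t, cond):
          # cond is a prompt embedding; an all-zero vector stands in for
          # "no prompt" (the classifier-free-guidance trick).
          t_feat = t.float().unsqueeze(-1) / 1000.0
          return self.net(torch.cat([x_t, t_feat, cond], dim=-1))

  @torch.no_grad()
  def sample(model, cond, steps=50, guidance_scale=3.0, img_dim=32 * 32):
      """DDPM-style ancestral sampling with classifier-free guidance."""
      betas = torch.linspace(1e-4, 0.02, steps)
      alphas = 1.0 - betas
      alpha_bars = torch.cumprod(alphas, dim=0)

      x = torch.randn(cond.shape[0], img_dim)   # start from pure white noise
      uncond = torch.zeros_like(cond)           # unconditional embedding

      for i in reversed(range(steps)):
          t = torch.full((cond.shape[0],), i, dtype=torch.long)
          eps_c = model(x, t, cond)             # conditional prediction
          eps_u = model(x, t, uncond)           # unconditional prediction
          # Guidance pushes the sample toward the conditioning signal.
          eps = eps_u + guidance_scale * (eps_c - eps_u)

          coef = betas[i] / torch.sqrt(1.0 - alpha_bars[i])
          mean = (x - coef * eps) / torch.sqrt(alphas[i])
          noise = torch.randn_like(x) if i > 0 else torch.zeros_like(x)
          x = mean + torch.sqrt(betas[i]) * noise
      return x

  # In a real model, cond would come from a text encoder for a prompt like
  # "dog with party hat"; here it's just a random placeholder embedding.
  model = TinyDenoiser()
  fake_prompt_embedding = torch.randn(1, 64)
  img = sample(model, fake_prompt_embedding)

The point is just that the prompt only enters through the conditioning input to the denoiser; the unconditional model in the article has no such input, so all it can do is sample from something resembling the training distribution.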


Thanks!



