If it can be trained on (many) existing games, then it might work the way image generation does: you don't need to describe every possible detail of an image to get something that looks like what you're asking for, and the underspecified parts still come out looking plausible.
Also: define "fun" and "new" in a "simple text prompt". Current image generators suck at reflecting exactly what you want, because they regurgitate existing things and styles.