These are ~2 years behind state of the art from the looks of it. Still cool that they're releasing anything that's open for researchers to play with, but it's nothing groundbreaking.
No, it is not as good as Veo, but better than Grok, I would say. Definitely better than what was available 2 years ago. And it is only a 7B research model!
> Are other open weight video models also this small?
Apple's models are weights-available, not open weights. And yes: WAN 2.1 has 1.3B models alongside its 14B models, and WAN 2.2 has a 5B model alongside its 14B models (the WAN 2.2 VAE used by Starflow-V is specifically the one used with the 5B model). Because the WAN models are, for the most part, genuinely open-weights models (Apache 2.0 licensed), there are lots of downstream open-licensed derivatives.
> Can this run on a single consumer card?
Modern model runtimes like ComfyUI can run models that do not fit in VRAM on a single consumer card by swapping model layers between RAM and VRAM as needed, so even models bigger than this one can run on a single consumer card.
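To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It is not ComfyUI's actual offloading code: the 2-bytes-per-parameter (fp16/bf16) figure, the layer count, and both function names are assumptions made up for illustration, and the estimate covers the weights only (no activations, VAE, or text encoder).

```python
def weight_gib(params_billion: float, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in GiB."""
    return params_billion * 1e9 * bytes_per_param / 2**30


def peak_resident_layers(num_layers: int, window: int) -> int:
    """Toy layer streaming: keep at most `window` layers 'in VRAM' at once."""
    on_gpu = []  # indices of layers currently resident on the GPU
    peak = 0
    for layer in range(num_layers):
        on_gpu.append(layer)      # stand-in for copying the layer RAM -> VRAM
        if len(on_gpu) > window:
            on_gpu.pop(0)         # stand-in for evicting the oldest layer
        peak = max(peak, len(on_gpu))
        # the layer's forward pass would run here
    return peak


if __name__ == "__main__":
    # 7B parameters at an assumed 2 bytes each: weights only
    print(f"7B weights at 2 bytes/param: {weight_gib(7, 2):.1f} GiB")
    # hypothetical 40-layer model streamed 4 layers at a time
    print("peak resident layers:", peak_resident_layers(40, 4))
```

The point of the toy loop is only that peak VRAM use scales with the window of resident layers rather than with the whole model; the price is the transfer time spent moving layers back and forth.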
The interesting part is that they chose a normalizing flow approach rather than the industry-standard diffusion approach. Not sure why they went in this direction, as I haven’t read the paper yet.
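For anyone unfamiliar with the term: a normalizing flow is an invertible map between data and a simple latent distribution, which gives an exact log-likelihood through the change-of-variables formula. A minimal one-dimensional sketch of that general idea follows. It says nothing about how Starflow-V itself is built; the affine map and the function names are purely illustrative assumptions.

```python
import math


def standard_normal_logpdf(z: float) -> float:
    """Log-density of N(0, 1) at z."""
    return -0.5 * (z * z + math.log(2 * math.pi))


def affine_flow_logpdf(x: float, mu: float, sigma: float) -> float:
    """Exact log p(x) under the flow z = (x - mu) / sigma with z ~ N(0, 1)."""
    z = (x - mu) / sigma          # invertible map from data to latent
    log_det = -math.log(sigma)    # log |dz/dx| for this affine map
    return standard_normal_logpdf(z) + log_det


def affine_flow_sample(z: float, mu: float, sigma: float) -> float:
    """Inverse direction: push a latent z back to data space."""
    return mu + sigma * z


if __name__ == "__main__":
    x = affine_flow_sample(0.5, mu=1.0, sigma=2.0)
    print(x, affine_flow_logpdf(x, mu=1.0, sigma=2.0))
```

Real flows stack many learned invertible layers and sum their log-determinants, but the likelihood is still exact in the same way. A diffusion model, by contrast, generates by iterative denoising and is typically trained on a denoising objective.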