Magvit: Masked Generative Video Transformer

thoweiu4o2i3434 · on June 23, 2023

Exciting decade for video encoding; by the end of the 2020s we'll probably see new codecs that aren't still stuck in '80s computer vision.

QuantumG · on June 23, 2023

4 months and still no code or models released.

pbronez · on June 23, 2023

Looks like it's stuck in Google's release review process. Author hopes to publish this week, per https://github.com/MAGVIT/magvit/issues/2#issuecomment-15981...

===========

Lijun-Yu commented Jun 20, 2023

I hate the delay in the code release due to company review policies... but hopefully it will be out this week during CVPR.

The initial version will live at https://github.com/google-research/magvit (not yet online as of 06/19), written in JAX/Flax. We are also going to release model weights trained on non-proprietary datasets, along with generated samples, so long as they're approved.

I'm also happy to help with any potential PyTorch reimplementations.

===========

famouswaffles · on June 23, 2023

Lol "Code available soon...forever" is a meme at this point.

amelius · on June 23, 2023

Someone should generate an animated gif for this meme.

piyh · on June 23, 2023

Text-to-video with a base image seems like an extremely powerful use case to me. Using Midjourney for a high-fidelity "Will Smith eating spaghetti" image and then continuing it with Magvit would yield much better results than the current state of the art.