Magvit: Masked Generative Video Transformer (cmu.edu)
114 points by mfiguiere on June 23, 2023 | 6 comments


Exciting decade for video encoding; by the end of the 2020s we'll probably see new codecs that aren't still stuck in 1980s computer vision.


4 months and still no code or models released.


Looks like it's stuck in Google's release review process. Author hopes to publish this week, per https://github.com/MAGVIT/magvit/issues/2#issuecomment-15981...

===========

Lijun-Yu commented Jun 20, 2023

I hate the delay in the code release due to company review policies... but hopefully it will be out this week during CVPR.

The initial version will live at https://github.com/google-research/magvit (not yet online as of 06/19), written in JAX/Flax. We are also going to release model weights trained on non-proprietary datasets, along with generated samples, so long as they're approved.

I'm also happy to help with any potential PyTorch reimplementations.

===========


Lol "Code available soon...forever" is a meme at this point.


Someone should generate an animated GIF for this meme.


Text-to-video with a base image seems like an extremely powerful use case to me. Using Midjourney to generate a high-fidelity "Will Smith eating spaghetti" image and then continuing it with Magvit would yield much better results than the current state of the art.



