We provide visual reconstruction examples at 32x temporal compression (8x8x32) with 8 latent channels. We compare with MAGVIT-v2 at a resolution of 256x256, since MAGVIT-v2 runs out of memory at 512x512. Even at this extreme temporal compression rate of 32x, our method preserves reconstruction quality to a much greater extent than MAGVIT-v2.
Panels: Reference, MAGVIT-v2, REGEN.
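To make the 8x8x32 setting with 8 latent channels concrete, the following is a minimal sketch of the latent shape and overall compression ratio it implies. The function names, the 3-channel RGB input, the 32-frame clip length, and the assumption that every dimension divides evenly by its compression factor are illustrative choices, not details taken from either method's implementation.

```python
def latent_shape(frames, height, width, t_factor=32, s_factor=8, latent_channels=8):
    """Return (channels, frames, height, width) of the latent.

    Assumes plain integer downsampling: 32x in time, 8x8 in space.
    Real tokenizers may handle the first frame or padding differently.
    """
    if frames % t_factor or height % s_factor or width % s_factor:
        raise ValueError("dimensions must be divisible by the compression factors")
    return (latent_channels, frames // t_factor, height // s_factor, width // s_factor)


def compression_ratio(frames, height, width, in_channels=3, **kwargs):
    """Ratio of input values to latent values (assumes an RGB input by default)."""
    c, t, h, w = latent_shape(frames, height, width, **kwargs)
    return (in_channels * frames * height * width) / (c * t * h * w)


# Hypothetical 32-frame clip at the 256x256 comparison resolution.
shape = latent_shape(32, 256, 256)
ratio = compression_ratio(32, 256, 256)
print(shape, ratio)
```

Under these assumptions, a 32-frame 256x256 clip maps to a single latent frame of 32x32 with 8 channels, and the ratio does not depend on the resolution as long as the dimensions divide evenly.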