AFAIK the premise is basically “given the last frame and the current input, what is the next frame?” So there’s no object permanence or structure behind it; it’s just making a video that, for any given span of two consecutive frames, looks like a video of Minecraft.
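To make the "no object permanence" point concrete, here is a toy sketch in Python. It is not the real model, which as far as I know is a learned network. `next_frame` and `rollout` are made-up names, and the one-row "frame" is an assumption purely for illustration. The only thing it shares with the premise above is that the next frame depends on nothing but the last frame and the current input.

```python
from typing import List, Sequence, Tuple

# Toy "image": one row of pixels. 0 = grass, 1 = a placed block.
Frame = Tuple[int, ...]


def next_frame(last_frame: Frame, action: str) -> Frame:
    """Hypothetical stand-in for the model.

    It sees only (last_frame, action). There is no world map or other
    hidden state, so anything that scrolls out of view is simply gone.
    """
    if action == "right":
        # Camera pans right: content slides left, and the newly revealed
        # pixel is filled with something plausible (grass).
        return last_frame[1:] + (0,)
    if action == "left":
        # Camera pans left: content slides right, new pixel on the left.
        return (0,) + last_frame[:-1]
    return last_frame


def rollout(first_frame: Frame, actions: Sequence[str]) -> List[Frame]:
    """Generate a video by feeding each output back in as the next input."""
    frames = [first_frame]
    for action in actions:
        frames.append(next_frame(frames[-1], action))
    return frames


# Look right three times, then look back left three times.
frames = rollout((0, 0, 1), ["right"] * 3 + ["left"] * 3)
for f in frames:
    print(f)
```

Every pair of consecutive frames here looks like a sensible camera pan, but the block that was visible in the first frame does not come back when the camera returns, because nothing ever remembered it was there.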