Abstract
Character Animation aims to generate character videos from still images through driving signals. Currently, diffusion models have become the mainstream in visual generation research, owing to their robust generative capabilities. However, challenges persist in the realm of image-to-video, especially in character animation, where maintaining temporal consistency with the character’s detailed appearance remains a formidable problem. In this paper, we leverage the power of diffusion models and propose a novel framework tailored for character animation. To preserve consistency of intricate appearance features from the reference image, we design ReferenceNet to merge detail features via spatial attention. To ensure controllability and continuity, we introduce an efficient pose guider to direct the character’s movements and employ an effective temporal modeling approach to ensure smooth transitions between video frames. By expanding the training data, our approach can animate arbitrary characters, yielding superior results in character animation compared to other image-to-video methods. Furthermore, we evaluate our method on benchmarks for fashion video and human dance synthesis, achieving state-of-the-art results.
Paper: https://arxiv.org/pdf/2311.17117.pdf
Project Page: https://humanaigc.github.io/animate-anyone/
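For anyone wondering what "merge detail features via spatial attention" could look like in practice, here is a rough toy sketch, not the authors' code. It assumes the merge works by putting the reference-image tokens next to the denoising tokens, running ordinary self-attention over the joint set, and keeping only the denoising half. The function names (`self_attention`, `merge_reference`), the identity Q/K/V projections, and the single head are all simplifications made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention over a list of feature vectors.

    Assumption: Q, K and V are the tokens themselves (identity projections);
    a real model would use learned projection matrices.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for query in tokens:
        weights = softmax(
            [scale * sum(a * b for a, b in zip(query, key)) for key in tokens]
        )
        out.append(
            [sum(w * value[i] for w, value in zip(weights, tokens)) for i in range(dim)]
        )
    return out


def merge_reference(denoise_tokens, reference_tokens):
    """Hypothetical merge step: attend over both token sets, keep the denoising part.

    Each token is the feature vector at one spatial location.
    """
    joint = denoise_tokens + reference_tokens  # concatenate along the spatial axis
    attended = self_attention(joint)
    return attended[: len(denoise_tokens)]  # drop the reference half


if __name__ == "__main__":
    frame_features = [[1.0, 0.0], [0.0, 1.0]]  # two spatial locations, 2 channels
    reference_features = [[1.0, 1.0]]  # one location from the reference image
    print(merge_reference(frame_features, reference_features))
```

The point of the sketch is that every output location is a weighted mix of both the frame's own features and the reference image's features, which is one plausible way appearance details could carry over into each generated frame.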
as if we didn’t have infinite TikTok dances already
You have no idea what’s coming. Or maybe you do. You can probably do more with what’s in this paper.
Ignoring the societal ramifications for a second, the future of entertainment is going to be insane.
I’ve been thinking up a sort of infinite The Elder Scrolls / Rimworld hybrid video game for the past 15 years or so and it’s always been a pipe dream. Mainly because it would have been impossible to get enough content/assets to make it work.
But by now it’s pretty much inevitable lol.
One person will be able to do the work of an entire games/animation studio. And eventually it’ll all be fully AI created anyway.
I expected this to happen maybe in 2050 or something, not now lmao