AtomoVideo: High Fidelity Image-to-Video Generation

Litong Gong, Yiran Zhu, Weijie Li, Xiaoyang Kang, Biao Wang, Tiezheng Ge, Bo Zheng. No Venue 2024

Datasets Evaluation Has Code Model Architecture RAG Tools Training Techniques

Video generation has recently advanced rapidly, building on strong text-to-image generation techniques. In this work, we propose AtomoVideo, a high-fidelity framework for image-to-video generation. Using multi-granularity image injection, we achieve higher fidelity of the generated video to the given image. In addition, thanks to high-quality datasets and training strategies, we achieve greater motion intensity while maintaining superior temporal consistency and stability. Our architecture extends flexibly to the video frame prediction task, enabling long-sequence prediction through iterative generation. Furthermore, owing to its adapter-based training design, our approach combines well with existing personalised models and controllable modules. In quantitative and qualitative evaluations, AtomoVideo achieves superior results compared to popular methods. More examples can be found on our project website: https://atomo-video.github.io/.
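
The abstract mentions long-sequence prediction through iterative generation but gives no detail. The sketch below only illustrates the general idea of a chunked rollout, in which each step is conditioned on the most recent frames. It is not the authors' implementation: the names `rollout`, `predict_chunk`, `chunk_size`, and `context_size` are hypothetical, and `toy_predictor` is a stand-in for a real video model.

```python
from typing import Callable, List, Sequence

# Stand-in for an image or latent; a real frame would be a tensor.
Frame = List[int]


def rollout(
    first_frame: Frame,
    predict_chunk: Callable[[Sequence[Frame], int], List[Frame]],
    total_frames: int,
    chunk_size: int = 4,
    context_size: int = 2,
) -> List[Frame]:
    """Extend a video chunk by chunk, conditioning each step on the latest frames."""
    if total_frames < 1 or chunk_size < 1 or context_size < 1:
        raise ValueError("total_frames, chunk_size and context_size must be >= 1")

    video: List[Frame] = [first_frame]
    while len(video) < total_frames:
        # Condition on the most recently generated frames (assumed scheme).
        context = video[-context_size:]
        # Do not overshoot the requested length on the final chunk.
        n = min(chunk_size, total_frames - len(video))
        new_frames = predict_chunk(context, n)
        if len(new_frames) != n:
            raise ValueError("predict_chunk returned an unexpected number of frames")
        video.extend(new_frames)
    return video


def toy_predictor(context: Sequence[Frame], n: int) -> List[Frame]:
    """Hypothetical model stand-in: each new frame shifts the last context frame."""
    last = context[-1]
    return [[value + step for value in last] for step in range(1, n + 1)]


video = rollout([0, 10], toy_predictor, total_frames=10)
```

In a real system, `predict_chunk` would wrap the generative model, and the choice of how many context frames to condition on would trade off temporal consistency against error accumulation over long rollouts.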
