IDOL Creates Photorealistic 3D Avatars from Single Images

From Single Images to Photorealistic 3D Avatars: IDOL Revolutionizes Human Representation
Creating animatable, photorealistic, full-body 3D avatars from a single image is difficult because human appearance and pose vary widely and high-quality training data is scarce. IDOL, a new method built on a feedforward transformer model, addresses this problem and enables fast, high-quality 3D human reconstruction.
The HuGe100K Dataset: Foundation for Realistic Avatars
A key factor in IDOL's success is the purpose-built HuGe100K dataset, which comprises 100,000 diverse, photorealistic sets of human images. Each set contains 24 views of a subject in a specific pose, generated with a pose-controllable image-to-multi-view model. The volume of data and the variety of views, poses, and appearances form the basis for training the IDOL model and support its ability to generalize.
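To give a sense of scale, the figures above can be turned into a quick calculation. The following is only an illustrative sketch: the class name `MultiViewDatasetSpec` and its methods are hypothetical, and the evenly spaced camera layout is an assumption, since the article does not describe how the 24 views are arranged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiViewDatasetSpec:
    """Hypothetical summary of a multi-view dataset (not an official API)."""

    num_sets: int       # one set per generated subject
    views_per_set: int  # rendered viewpoints per subject

    def total_images(self) -> int:
        # Every set contributes the same number of views.
        return self.num_sets * self.views_per_set

    def view_azimuths_deg(self) -> list[float]:
        # ASSUMPTION: cameras evenly spaced on a circle around the subject.
        step = 360.0 / self.views_per_set
        return [i * step for i in range(self.views_per_set)]


# Numbers taken from the article: 100,000 sets with 24 views each.
huge100k = MultiViewDatasetSpec(num_sets=100_000, views_per_set=24)
print(huge100k.total_images())            # 2400000
print(huge100k.view_azimuths_deg()[:3])   # [0.0, 15.0, 30.0]
```

Under these numbers the dataset amounts to 2.4 million images in total.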
IDOL: An Innovative Transformer Model
At the heart of IDOL is a scalable feedforward transformer model. Given a single image of a person, it is trained to predict a 3D Gaussian representation in a unified space while disentangling human pose, body shape, clothing geometry, and texture. The resulting Gaussian representation can be animated directly, without post-processing, which significantly simplifies the workflow.
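The idea of animating Gaussians that live in one shared space can be illustrated with a toy example. The sketch below is not IDOL's implementation: the article does not say how animation is driven, so linear blend skinning, a common technique for posing body models, is assumed here. All names (`Gaussian`, `BoneTransform`, `animate`) are hypothetical, and real 3D Gaussians also carry scale, rotation, and opacity, which are omitted for brevity.

```python
import math
from dataclasses import dataclass

Vec3 = tuple[float, float, float]


@dataclass
class Gaussian:
    center: Vec3               # position in the shared rest-pose space
    color: Vec3                # RGB appearance, stored apart from geometry
    skin_weights: list[float]  # one weight per bone, summing to 1


@dataclass
class BoneTransform:
    angle_z: float     # rotation about the z axis, in radians
    translation: Vec3  # translation applied after the rotation

    def apply(self, p: Vec3) -> Vec3:
        c, s = math.cos(self.angle_z), math.sin(self.angle_z)
        x, y, z = p
        tx, ty, tz = self.translation
        return (c * x - s * y + tx, s * x + c * y + ty, z + tz)


def animate(gaussians: list[Gaussian], bones: list[BoneTransform]) -> list[Gaussian]:
    """Pose each Gaussian center as a weighted blend of per-bone transforms."""
    posed = []
    for g in gaussians:
        moved = [bone.apply(g.center) for bone in bones]
        center = tuple(
            sum(w * m[axis] for w, m in zip(g.skin_weights, moved))
            for axis in range(3)
        )
        # Color and weights are untouched: only the geometry is re-posed.
        posed.append(Gaussian(center, g.color, g.skin_weights))
    return posed


# Two bones: one stays in place, one moves up by one unit along y.
bones = [BoneTransform(0.0, (0.0, 0.0, 0.0)), BoneTransform(0.0, (0.0, 1.0, 0.0))]
rest = [Gaussian((1.0, 0.0, 0.0), (0.8, 0.6, 0.5), [0.5, 0.5])]
posed = animate(rest, bones)
print(posed[0].center)  # (1.0, 0.5, 0.0)
```

Because pose only changes the centers while color is carried over unchanged, the same structure also hints at why shape and texture can be edited independently.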
Fast and Efficient Reconstruction
IDOL is fast and efficient. Reconstructing a photorealistic human at 1K resolution from a single input image takes seconds on a single GPU. This opens up new possibilities for time-critical applications in areas such as virtual reality, gaming, and 3D content creation.
Versatile Application Possibilities
The 3D avatars created with IDOL are not only photorealistic and animatable but also editable. Both shape and texture can be adjusted, which widens the creative possibilities. IDOL can therefore support a range of applications in graphics, vision, and related fields.
Future Perspectives
IDOL represents a significant advance in the field of 3D human reconstruction. The combination of a comprehensive, generated dataset and an innovative transformer model enables the fast and high-quality creation of animatable avatars. The high generalizability and the diverse application possibilities open up exciting perspectives for the future of 3D modeling and animation.
Bibliography
Zhuang, Y., Lv, J., Wen, H., Shuai, Q., Zeng, A., Zhu, H., Chen, S., Yang, Y., Cao, X., & Liu, W. (2024). IDOL: Instant Photorealistic 3D Human Creation from a Single Image. arXiv preprint arXiv:2412.14963.
Feng, Q., Yang, Y., Lai, Y.-K., & Li, K. (2023). R2Human: Real-Time 3D Human Appearance Rendering from a Single Image. arXiv preprint arXiv:2312.05826.
Zheng, Z., Yu, T., Wei, Y., Dai, Q., & Liu, Y. (2019). DeepHuman: 3D Human Reconstruction from a Single Image. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9755-9764).


