PartGen Enables Part-Level 3D Modeling from Text and Images


Next-Generation 3D Modeling: Disassembling Objects into Parts with PartGen

The world of 3D modeling is advancing rapidly. Text-to-3D generators and 3D scanners now deliver models of impressive quality in shape and texture. However, these models usually consist of a single, monolithic representation, whether an implicit neural field, a Gaussian mixture, or a mesh. This lack of internal structure makes it difficult to edit or adapt individual parts. A novel approach called PartGen addresses this problem: it generates 3D objects composed of individual, meaningful parts, starting from text, images, or unstructured 3D objects.

How Does PartGen Work?

The PartGen pipeline consists of two main phases. First, a multi-view diffusion model analyzes several views of a 3D object, whether generated or rendered, and extracts a plausible part segmentation that is consistent across the views, decomposing the object into individual parts. In the second phase, another multi-view diffusion model takes each part, completes its occluded regions, and passes the completed views to 3D reconstruction. The completion is conditioned on the context of the entire object so that the parts fit together afterwards. Notably, the generative completion model can fill in information lost to occlusion and, in extreme cases, even hallucinate entirely invisible parts consistent with the input object.
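
The two-phase structure can be sketched as follows. This is a minimal, illustrative skeleton, not PartGen's actual implementation: the diffusion models are replaced by trivial placeholder functions (an intensity threshold standing in for multi-view segmentation, a mean-fill standing in for generative completion, and a view average standing in for 3D reconstruction), and all names are hypothetical.

```python
import numpy as np

def segment_parts(views):
    """Phase 1 stand-in: assign each pixel of every view to a part.
    In PartGen this is a multi-view diffusion model that produces
    segmentations consistent across views; here a toy intensity
    threshold splits each view into part 0 and part 1."""
    return [(view > 0.5).astype(int) for view in views]

def complete_and_reconstruct(views, masks, part_id):
    """Phase 2 stand-in: isolate one part in every view, fill the
    occluded (masked-out) pixels with a crude 'completion' (the part's
    mean value), then fuse the completed views into one representation
    (their average, standing in for 3D reconstruction)."""
    completed = []
    for view, mask in zip(views, masks):
        part = np.where(mask == part_id, view, np.nan)  # isolate the part
        fill = np.nanmean(part)                         # trivial completion
        completed.append(np.where(np.isnan(part), fill, part))
    return np.mean(completed, axis=0)                   # trivial "reconstruction"

# Toy input: four 8x8 grayscale "views" of the same object.
rng = np.random.default_rng(0)
views = [rng.random((8, 8)) for _ in range(4)]

masks = segment_parts(views)                            # phase 1: per-view part maps
parts_3d = [complete_and_reconstruct(views, masks, pid) for pid in (0, 1)]
```

The key structural point survives even in this toy form: segmentation runs once over all views, while completion and reconstruction run once per part, so each part ends up as its own independent asset.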

Advantages and Applications

Decomposition into individual parts offers numerous advantages. Designers can edit individual components of a 3D model without altering the rest of the object, and the part-based representation greatly simplifies the automated generation of variations: for example, the size or shape of individual parts could be adjusted automatically to create different product variants. PartGen thus opens up new possibilities for product design, animation, and virtual reality. For Mindverse, as a provider of AI-powered content solutions, PartGen offers considerable potential; the technology could be integrated into the platform to make creating and editing complex 3D models easier for users.
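
The kind of local edit described above becomes trivial once a model is a collection of named parts. The sketch below assumes a hypothetical part-level representation (a dictionary mapping part names to vertex arrays; this is not PartGen's actual output format) and scales a single part about its own centroid while leaving the rest untouched.

```python
import numpy as np

# Hypothetical part-level model: each part is a named set of mesh vertices.
model = {
    "body":  np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.5, 1.0, 0.0]]),
    "wheel": np.array([[0.2, -0.2, 0.0], [0.4, -0.2, 0.0], [0.3, -0.4, 0.0]]),
}

def scale_part(model, part, factor):
    """Scale one part about its own centroid, leaving all other parts
    unchanged -- the kind of local edit a monolithic mesh does not
    support directly."""
    edited = dict(model)
    verts = model[part]
    centroid = verts.mean(axis=0)
    edited[part] = centroid + factor * (verts - centroid)
    return edited

variant = scale_part(model, "wheel", 1.5)  # enlarge only the wheel by 50%
```

With a monolithic mesh, the same edit would first require figuring out which vertices belong to the wheel; with a part-based representation, that assignment already exists.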

Evaluation and Outlook

PartGen has been evaluated on generated and real 3D objects and significantly outperforms existing segmentation and part-extraction methods. The results show that the method produces robust, detailed 3D models consisting of clearly defined parts, and applications such as 3D part editing demonstrate the approach's potential for creative workflows. The development of PartGen is still at an early stage, but the results so far are promising. Future research could focus on improving segmentation accuracy and broadening the range of applications. Integrating PartGen into platforms like Mindverse could make 3D modeling accessible to a wider audience and pave the way for innovative applications. Combined with Mindverse's existing AI solutions, such as chatbots, voicebots, and AI search engines, entirely new possibilities for content creation could emerge.

Bibliography:

Chen, M., Shapovalov, R., Laina, I., Monnier, T., Wang, J., Novotny, D., & Vedaldi, A. (2024). PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models. arXiv preprint arXiv:2412.18608.

Yang, Y., Huang, Y., Wu, X., Guo, Y.-C., Zhang, S.-H., Zhao, H., He, T., & Liu, X. (2024). DreamComposer: Controllable 3D Object Generation via Multi-View Conditions. arXiv preprint arXiv:2312.03611.

Gao, R., Holynski, A., Henzler, P., Brussee, A., Martin-Brualla, R., Srinivasan, P. P., Barron, J. T., & Poole, B. (2024). CAT3D: Create Anything in 3D with Multi-View Diffusion Models. In NeurIPS. arXiv preprint arXiv:2405.10314. https://cat3d.github.io/

Chen, M. PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models. https://www.youtube.com/watch?v=Ma_Nk85L3d4

Wang, C. Awesome 3D Diffusion. https://github.com/cwchenwang/awesome-3d-diffusion

Zheng, X.-Y., Pan, H., Guo, Y.-X., Tong, X., & Liu, Y. (2024). MVD^2: Efficient Multiview 3D Reconstruction for Multiview Diffusion. arXiv preprint arXiv:2402.14253.

MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View. https://bohrium.dp.tech/paper/arxiv/2405.03894