2. Related Works
2.1. 2D Diffusion Models for 3D Generation
2.2. 3D Generative Models
2.3. Multi-view Diffusion Models
3. Problem Formulation
3.2. The Distribution of 3D Assets
4. Method
4.1. Consistent Multi-view Generation
5. Experiments
5.4. Single View Reconstruction
5.5. Novel View Synthesis
5.6. Discussions
6. Conclusions and Future Works
Acknowledgements and References
We evaluate the quality of the geometry reconstructed by the different methods. Quantitative results are summarized in Table 1, and qualitative comparisons are presented in Fig. 6. Shap-E [25] tends to produce incomplete and distorted meshes. SyncDreamer [33] generates shapes that are roughly aligned with the input image but lack geometric detail, and its texture quality is subpar. One-2-3-45 [30] reconstructs meshes from the multi-view-inconsistent outputs of Zero123 [31]; while it captures coarse geometry, it loses important details in the process. In comparison, our method achieves the highest reconstruction quality in both geometry and texture.
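The excerpt does not reproduce the metrics behind Table 1, but geometry-reconstruction comparisons of this kind are commonly scored with a symmetric Chamfer distance between points sampled from the predicted and ground-truth surfaces. A minimal sketch of that metric (the actual metrics and sampling protocol used in Table 1 may differ):

```python
import numpy as np

def chamfer_distance(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3).

    For each point in one set, take the squared distance to its nearest
    neighbor in the other set; average over both directions and sum.
    """
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((pred[:, None, :] - gt[None, :, :]) ** 2, axis=-1)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

# Toy usage: identical point sets score exactly 0.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(chamfer_distance(pts, pts))  # → 0.0
```

In practice the point sets would be sampled densely from the compared meshes, and a brute-force pairwise matrix would be replaced by a k-d tree (e.g. `scipy.spatial.cKDTree`) for large clouds.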
This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license.
Authors:
(1) Xiaoxiao Long, The University of Hong Kong, VAST, and MPI Informatik (equal contribution);
(2) Yuan-Chen Guo, Tsinghua University and VAST (equal contribution);
(3) Cheng Lin, The University of Hong Kong (corresponding author);
(4) Yuan Liu, The University of Hong Kong;
(5) Zhiyang Dou, The University of Hong Kong;
(6) Lingjie Liu, University of Pennsylvania;
(7) Yuexin Ma, ShanghaiTech University;
(8) Song-Hai Zhang, The University of Hong Kong;
(9) Marc Habermann, MPI Informatik;
(10) Christian Theobalt, MPI Informatik;
(11) Wenping Wang, Texas A&M University (corresponding author).