3 Method
3.1 Proxy-Guided 3D Conditioning for Diffusion
3.2 Interactive Generation Workflow
3.3 Volume Conditioned Reconstruction
4 Experiment
4.1 Comparison on Proxy-based and Image-based 3D Generation
5 Conclusions, Acknowledgments, and References
SUPPLEMENTARY MATERIAL
For personalized generation demands, we believe that using only text or images is insufficient and unintuitive for expressing the 3D structure of objects and their spatial relationships. Hence, granting the system 3D-aware controllability through a 3D proxy is necessary for 3D generation. As for acquiring 3D proxies, we believe this is not an obstacle for target users: a proxy can be easily assembled with beginner-friendly software such as Tinkercad, taken from 3D modeling games on SteamVR, or produced with LLM-driven procedural modeling instructions. Similarly, ControlNet accepts control images ranging from rough sketches to delicate line art, which likewise requires only basic painting skills.
First, the resolution of 3D-aware control is bounded by the size of the proxy feature volume, so our method cannot fully exploit the control offered by complex high-poly models; for example, we cannot generate a large-scale urban scene with satisfactory building details. Second, our method requires manually tuning the control strength to balance between over-constrained and under-constrained results. This is similar to ControlNet [Zhang et al. 2023], where the control strength mainly depends on the creator's aesthetic choices.
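To make the control-strength trade-off concrete, below is a minimal sketch of ControlNet-style strength scaling, where a scalar multiplies the conditioning residual before it is added back into the diffusion backbone. This is an illustrative assumption, not the paper's implementation: the names ControlledBlock, cond, and control_strength are hypothetical, and the conditioning features stand in for features sampled from the proxy volume.

```python
import torch
import torch.nn as nn

class ControlledBlock(nn.Module):
    """Hypothetical UNet block with a strength-scaled conditioning residual,
    in the spirit of ControlNet [Zhang et al. 2023]."""

    def __init__(self, channels: int):
        super().__init__()
        self.backbone = nn.Conv2d(channels, channels, 3, padding=1)
        # Zero-initialized projection so conditioning starts as a no-op,
        # mirroring ControlNet's zero-convolution trick.
        self.cond_proj = nn.Conv2d(channels, channels, 1)
        nn.init.zeros_(self.cond_proj.weight)
        nn.init.zeros_(self.cond_proj.bias)

    def forward(self, x, cond, control_strength: float = 1.0):
        # control_strength = 0.0 -> unconditional (under-constrained);
        # control_strength = 1.0 -> full proxy control (over-constrained
        # if the proxy is too coarse for the desired detail).
        return self.backbone(x) + control_strength * self.cond_proj(cond)

block = ControlledBlock(channels=64)
x = torch.randn(1, 64, 32, 32)     # backbone activations
cond = torch.randn(1, 64, 32, 32)  # features derived from the 3D proxy
out = block(x, cond, control_strength=0.5)
```

As the comments note, the creator sweeps control_strength by hand until the result is neither over- nor under-constrained, which is exactly the aesthetic tuning the limitation describes.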
Authors:
(1) Wenqi Dong, Zhejiang University (work done during his internship at PICO, ByteDance);
(2) Bangbang Yang, ByteDance (contributed equally with Wenqi Dong);
(3) Lin Ma, ByteDance;
(4) Xiao Liu, ByteDance;
(5) Liyuan Cui, Zhejiang University;
(6) Hujun Bao, Zhejiang University;
(7) Yuewen Ma, ByteDance;
(8) Zhaopeng Cui, Zhejiang University (corresponding author).
This paper is