Authors:
(1) Han Jiang, HKUST and Equal contribution (hjiangav@connect.ust.hk);
(2) Haosen Sun, HKUST and Equal contribution (hsunas@connect.ust.hk);
(3) Ruoxuan Li, HKUST and Equal contribution (rliba@connect.ust.hk);
(4) Chi-Keung Tang, HKUST (cktang@cs.ust.hk);
(5) Yu-Wing Tai, Dartmouth College (yu-wing.tai@dartmouth.edu).

Table of Links
Abstract and 1. Introduction
2. Related Work
2.1. NeRF Editing and 2.2. Inpainting Techniques
2.3. Text-Guided Visual Content Generation
3. Method
3.1. Training View Pre-processing
3.2. Progressive Training
3.3. 4D Extension
4. Experiments and 4.1. Experimental Setups
4.2. Ablation and comparison
5. Conclusion and 6. References

4.2. Ablation and comparison

To demonstrate the efficacy of the individual elements of our baseline design, we compare our baseline with the following three variants. To the best of our knowledge, we are the first to target generative NeRF inpainting, so no existing baselines are available for direct comparison. We therefore conduct ablations by modifying parts of our baseline, and comparisons by replacing a subprocess with an existing method.

View independent inpainting. In our baseline, all training images are pre-processed to be consistent with the first seed image, which guarantees coarse convergence. To show that this pre-processing strategy is necessary, in this variant we inpaint all training images independently. Warmup training and iterative dataset update (IDU) are kept unchanged. The results are shown in Figure 4. Independent inpainting clearly results in divergence across views, and IDU fails to bring the highly inconsistent initial images to convergence. This shows that the pre-processing step is indispensable for convergence.

Instruct-Nerf2Nerf after warmup. The second stage of our NeRF training is largely based on iterative dataset update, which is the core of instruct-nerf2nerf [4]. This raises the question of whether instruct-nerf2nerf can accomplish our task. Since instruct-nerf2nerf is designed mainly for appearance editing without large geometry changes, running it directly on our generative inpainting task is inappropriate.
Therefore, before running their method, we first run our baseline until the end of warmup training, so that a coarse geometry is available as a prerequisite. With this design, the main difference between the two methods is that instruct-nerf2nerf uses instruct-pix2pix, a non-latent diffusion model, to correct the training images. Figure 5 compares its results with ours. The final result retains the blurriness of the warmup NeRF and does not achieve a higher level of consistency.

Warmup without depth supervision. The goal of warmup training is to provide initial convergence, especially in geometry, which requires a depth loss as supervision. To demonstrate this, we run warmup training without any depth supervision and show the rendered depth map in Figure 6. Compared with the depth map from our standard setting, this depth map has clear errors within and around the target object. Because of these geometric inconsistencies, the later fine training stage may suffer from “floater” and inconsistency artifacts. Using our simple planar depth map as guidance eliminates these problems, even though the planar depth is not accurate.

This paper is available on arXiv under a CC 4.0 license.
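The role of depth supervision in the warmup ablation can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `warmup_loss`, the L2 form of both terms, the `depth_weight` parameter, and the use of a constant-depth plane inside the inpainting mask are all assumptions made for illustration. It only shows how a coarse planar depth prior can be added to a photometric loss so that geometry inside the masked region is constrained during warmup.

```python
from typing import Sequence, Tuple

RGB = Tuple[float, float, float]


def warmup_loss(
    pred_rgb: Sequence[RGB],
    target_rgb: Sequence[RGB],
    pred_depth: Sequence[float],
    planar_depth: Sequence[float],
    mask: Sequence[bool],
    depth_weight: float = 0.1,  # hypothetical weighting, not from the paper
) -> float:
    """Photometric MSE plus a masked L2 depth term, computed per ray.

    `planar_depth` stands in for a coarse planar depth prior; only rays
    whose `mask` entry is True (inside the inpainted region) contribute
    to the depth term.
    """
    n = len(pred_rgb)
    # Mean squared error over all rays and colour channels.
    rgb_term = sum(
        (p - t) ** 2
        for pr, tr in zip(pred_rgb, target_rgb)
        for p, t in zip(pr, tr)
    ) / (3 * n)
    # Squared depth error restricted to masked rays.
    masked = [
        (d - g) ** 2
        for d, g, m in zip(pred_depth, planar_depth, mask)
        if m
    ]
    depth_term = sum(masked) / len(masked) if masked else 0.0
    return rgb_term + depth_weight * depth_term


# Usage: two rays with perfect colour, one of which is off the plane.
rgb = [(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)]
with_depth = warmup_loss(rgb, rgb, [2.0, 3.0], [2.0, 2.0], [True, True])
without_depth = warmup_loss(
    rgb, rgb, [2.0, 3.0], [2.0, 2.0], [True, True], depth_weight=0.0
)
```

Setting `depth_weight` to zero corresponds to the "without depth supervision" variant in this sketch: the colour term alone cannot penalise the ray whose depth departs from the prior, which is one way to picture how geometric errors can go uncorrected during warmup.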

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

NeRF Editing and Inpainting Techniques: Ablation and comparison
