Authors:
(1) Kun Lan, University of Science and Technology of China;
(2) Haoran Li, University of Science and Technology of China;
(3) Haolin Shi, University of Science and Technology of China;
(4) Wenjun Wu, University of Science and Technology of China;
(5) Yong Liao, University of Science and Technology of China;
(6) Lin Wang, AI Thrust, HKUST(GZ);
(7) Pengyuan Zhou, University of Science and Technology of China.
3. Method
3.1. Point-Based Rendering and Semantic Information Learning
3.2. Gaussian Clustering
3.3. Gaussian Filtering
4. Experiment
4.1. Setups
4.2. Result
4.3. Ablations
5. Conclusion

We propose a 3D Gaussian segmentation method guided by 2D segmentation maps, attaching to each 3D Gaussian a probability distribution over object categories so that the majority of 3D Gaussians in a scene can be segmented. We further exploit the spatial continuity of objects through KNN clustering, which encourages nearby 3D Gaussians to share the same category, and apply an optional statistical filtering step to remove incorrectly segmented Gaussians. As an initial step toward 3D understanding and editing, the method has a wide range of potential applications in downstream tasks. We demonstrate its effectiveness on common NeRF datasets.
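To make the two post-processing steps concrete, here is a minimal sketch (not the authors' implementation) of KNN-based label smoothing followed by statistical outlier filtering, assuming each 3D Gaussian already carries a learned category-probability vector. The array names (`positions`, `probs`) and the hyperparameters (`k`, `std_ratio`) are illustrative assumptions.

```python
# Minimal sketch of KNN label smoothing + statistical filtering for
# per-Gaussian category distributions. Names and defaults are assumptions,
# not the paper's exact settings.
import numpy as np
from scipy.spatial import cKDTree

def knn_smooth_labels(positions, probs, k=16):
    """Average each Gaussian's category distribution with its k nearest
    neighbors so that spatially adjacent Gaussians tend to share a label."""
    tree = cKDTree(positions)
    _, idx = tree.query(positions, k=k + 1)   # neighbor set includes the point itself
    smoothed = probs[idx].mean(axis=1)        # (N, k+1, C) -> (N, C)
    return smoothed.argmax(axis=1)            # hard labels after smoothing

def statistical_filter(positions, labels, target, k=16, std_ratio=2.0):
    """Within one segmented category, drop Gaussians whose mean distance to
    their k nearest neighbors is a statistical outlier."""
    mask = labels == target
    pts = positions[mask]
    tree = cKDTree(pts)
    d, _ = tree.query(pts, k=k + 1)
    mean_d = d[:, 1:].mean(axis=1)            # skip the zero self-distance
    keep = mean_d < mean_d.mean() + std_ratio * mean_d.std()
    return np.flatnonzero(mask)[keep]         # indices of Gaussians kept

# Example with random data standing in for a trained scene:
positions = np.random.rand(1000, 3).astype(np.float32)  # Gaussian centers
probs = np.random.dirichlet(np.ones(5), size=1000)      # 5 categories
labels = knn_smooth_labels(positions, probs)
kept = statistical_filter(positions, labels, target=0)
```

The smoothing step enforces the spatial-continuity prior described above, while the filtering step discards isolated, mislabeled Gaussians within a category; both operate purely on Gaussian centers and the learned distributions, so they can run after training without touching the renderer.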
This paper is available on arXiv under a CC 4.0 license.