Defining Open-Vocabulary Segmentation: Problem Setup, Baseline, and the Uni-OVSeg Framework

by Segmentation, November 12th, 2024

Too Long; Didn't Read

In this section, we define open-vocabulary segmentation as the task of segmenting images into masks linked to semantic categories not seen during training. We then introduce a baseline approach and present our proposed Uni-OVSeg framework, which tackles the challenge of segmenting novel categories using natural language representations of test categories.

Authors:

(1) Zhaoqing Wang, The University of Sydney and AI2Robotics;

(2) Xiaobo Xia, The University of Sydney;

(3) Ziye Chen, The University of Melbourne;

(4) Xiao He, AI2Robotics;

(5) Yandong Guo, AI2Robotics;

(6) Mingming Gong, The University of Melbourne and Mohamed bin Zayed University of Artificial Intelligence;

(7) Tongliang Liu, The University of Sydney.

Abstract and 1. Introduction

2. Related works

3. Method and 3.1. Problem definition

3.2. Baseline and 3.3. Uni-OVSeg framework

4. Experiments

4.1. Implementation details

4.2. Main results

4.3. Ablation study

5. Conclusion

6. Broader impacts and References


A. Framework details

B. Promptable segmentation

C. Visualisation

3. Method

In this section, we first define the problem of open-vocabulary segmentation in Sec. 3.1. We then introduce a straightforward baseline in Sec. 3.2. Finally, we present our proposed Uni-OVSeg framework in Sec. 3.3, including an overview, mask generation, mask-text alignment, and open-vocabulary inference.

3.1. Problem definition


This paper is available on arXiv under the CC BY 4.0 DEED license.