This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license.
Authors:
(1) Akash Chikhalikar, Graduate School of Engineering, Department of Robotics, Tohoku University;
(2) Ankit A. Ravankar, Graduate School of Engineering, Department of Robotics, Tohoku University;
(3) Jose Victorio Salazar Luces, Graduate School of Engineering, Department of Robotics, Tohoku University;
(4) Yasuhisa Hirata, Graduate School of Engineering, Department of Robotics, Tohoku University.
In recent years, the demand for service robots capable of executing tasks beyond autonomous navigation has grown. In the future, service robots will be expected to perform complex tasks like ‘Set table for dinner’. High-level tasks like these require, among other capabilities, the ability to retrieve multiple targets. This paper delves into the challenge of locating multiple targets in an environment, termed ‘Find my Objects’. We present a novel heuristic designed to help robots conduct a preferential search for multiple targets in indoor spaces. Our approach involves a Semantic SLAM framework that combines semantic object recognition with geometric data to generate a multi-layered map. We fuse the semantic maps with probabilistic priors for efficient inference. Recognizing that obstacles may obscure a navigation goal and render standard point-to-point navigation strategies less viable, our methodology offers resilience to such factors. Importantly, our method is adaptable to various object detectors, RGB-D SLAM techniques, and local navigation planners. We demonstrate the ‘Find my Objects’ task in real-world indoor environments, yielding quantitative results that attest to the effectiveness of our methodology. This strategy can be applied in scenarios where service robots need to locate, grasp, and transport objects, taking into account user preferences.
Service robots, now ubiquitous in domestic environments, are being deployed for a diverse array of applications ranging from object delivery and patrolling to surveillance, cleaning, and monitoring. Additionally, advancements in techniques such as SLAM (Simultaneous Localization and Mapping), deep learning architectures, path planning algorithms, and object manipulation have accelerated recent development [1]. Their integration spans a spectrum of use cases, from automating repetitive household chores to specialized long-term care applications for the elderly [2], [3]. However, one fundamental skill that all service robots require is the ability to retrieve items effectively.
Relying solely on exhaustive exploration or random navigation can be energy-intensive and may not meet time requirements. Therefore, it is essential to harness diverse information sources to refine robotic object search. Semantic SLAM can provide significant support in this regard. Semantic SLAM involves the extraction and integration of semantic understanding with geometric data to produce detailed, multi-layered maps. These maps not only identify landmarks like furniture and appliances typical in a house setting but also classify areas such as living rooms [4], [5]. In addition, state-of-the-art semantic SLAM approaches incorporate beliefs about dynamic entities such as other agents or humans [6].
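To make the idea of a multi-layered semantic map concrete, the following is a minimal Python sketch (illustrative only, not the paper's implementation): it pairs a geometric occupancy layer from SLAM with an object-landmark layer and a region layer. All class and field names here are assumptions introduced for the example.

```python
# Minimal sketch of a multi-layered semantic map (illustrative, not the authors' code).
from dataclasses import dataclass, field
from typing import List, Tuple

import numpy as np


@dataclass
class ObjectLandmark:
    label: str                  # e.g. "cup", as reported by the object detector
    position: Tuple[float, float]  # (x, y) in map coordinates
    confidence: float           # detector confidence


@dataclass
class Region:
    name: str                   # e.g. "kitchen_1"
    region_type: str            # e.g. "kitchen", "living_room"
    centroid: Tuple[float, float]  # representative navigation point


@dataclass
class SemanticMap:
    occupancy: np.ndarray                                   # geometric layer (occupancy grid)
    objects: List[ObjectLandmark] = field(default_factory=list)  # object layer
    regions: List[Region] = field(default_factory=list)          # region layer

    def objects_in(self, region: Region, radius: float = 2.0) -> List[ObjectLandmark]:
        """Return landmarks within `radius` metres of a region's centroid."""
        cx, cy = region.centroid
        return [o for o in self.objects
                if (o.position[0] - cx) ** 2 + (o.position[1] - cy) ** 2 <= radius ** 2]
```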
With the evolution of deep neural networks, extracting semantic information has become more streamlined. Convolutional neural networks (CNNs) like YOLOv7 [7] facilitate real-time object detection, while models like Mask R-CNN [8] enable detailed instance segmentation. By leveraging semantic SLAM and deep learning, service robots can enhance their object search capabilities.
Despite significant advancements in robotics, robots today still lack the intuitive scene awareness that humans naturally possess. For instance, while a robot can identify a kitchen and a cup, it should also understand that ‘cups are typically found in kitchens’. This intuitive knowledge of place-object associations can be captured using ontologies [9], [10]. Therefore, blending information from probabilistic priors with semantic maps is crucial for an efficient search.
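As an illustrative aid (not the paper's exact formulation), the sketch below shows one simple way such place-object priors could be fused with mapped regions: each region inherited from the semantic map is scored by an assumed prior P(object | region type) and the robot visits regions in decreasing order of that score. The prior values and function names are hypothetical placeholders.

```python
# Minimal sketch: rank mapped regions by a place-object prior (values are assumed).
PLACE_OBJECT_PRIOR = {
    ("cup", "kitchen"): 0.7,
    ("cup", "living_room"): 0.2,
    ("book", "living_room"): 0.6,
    ("book", "kitchen"): 0.1,
}


def rank_regions(target, mapped_regions):
    """Order regions by the prior probability of containing `target`.

    `mapped_regions` is a list of (region_name, region_type) pairs taken from
    the semantic map; unknown pairings fall back to a small uniform prior.
    """
    scored = [(PLACE_OBJECT_PRIOR.get((target, rtype), 0.05), name)
              for name, rtype in mapped_regions]
    return [name for _, name in sorted(scored, reverse=True)]


# Example: when searching for a cup, the kitchen is visited before the living room.
print(rank_regions("cup", [("living_room_1", "living_room"), ("kitchen_1", "kitchen")]))
```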
To execute even higher-level tasks such as ‘Set table for dinner’ or ‘Prepare bag for work’, robots will have to efficiently plan for and retrieve multiple items in the environment. Given these considerations, our research addresses multi-target search in indoor settings, termed the ‘Find my Objects’ task. We have designed a framework that efficiently explores semantic maps to look for commonly occurring items in daily life. The main contributions of this paper are:
• Defining a multi-target search challenge in indoor settings with adaptable region-to-region navigation.
• A framework to integrate probabilistic priors with multi-layered semantic maps for search.
• A novel heuristic to prioritize targets based on user input/demands.
• Quantitative results obtained in real environments and comparisons of our proposed heuristic with baselines for object search.
The remainder of the paper is structured as follows. Section II reviews related work and highlights our contributions. Section III details our methodology to create a multi-layered semantic map. Section IV discusses the multi-target search algorithm. Section V presents our experiments, quantitative results, and analysis of the observed data. Finally, Section VI concludes the paper with a discussion.