ReLoc-PDR: A Robust Smartphone System for Indoor Pedestrian Positioning

Written by reckoning | Published 2025/09/02
Tech Story Tags: pedestrian-dead-reckoning | visual-inertial-odometry | graph-optimization | visual-relocalization | drift-correction | robust-pedestrian-tracking | indoor-navigation | localization-accuracy

TL;DR: In this research, we introduce ReLoc-PDR, a robust pedestrian positioning method that uses graph optimization to combine visual relocalization and pedestrian dead reckoning (PDR). To ensure consistent and dependable localization, ReLoc-PDR efficiently manages anomalous visual observations and reduces cumulative drift by utilizing deep learning-based descriptors and a Tukey kernel fusion process. Unlike infrastructure-dependent approaches, this smartphone-based technology enables reliable and accurate pedestrian tracking in visually degraded and satellite-denied situations, including nighttime scenes and texture-less hallways. Real-world tests confirm the system's strong performance, making ReLoc-PDR a promising method for practical, infrastructure-free indoor navigation.

Authors:

(1) Zongyang Chen, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China ([email protected]);

(2) Xianfei Pan, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China ([email protected]);

(3) Changhao Chen, College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China ([email protected]).

Table of Links

Abstract and 1. Introduction

II. Related Work

III. Visual Relocalization Enhanced Pedestrian Dead Reckoning

IV. Experiments

V. Conclusion and References

Abstract— Accurately and reliably positioning pedestrians in satellite-denied conditions remains a significant challenge. Pedestrian dead reckoning (PDR) is commonly employed to estimate pedestrian location using low-cost inertial sensors. However, PDR is susceptible to drift due to sensor noise, incorrect step detection, and inaccurate stride length estimation. This work proposes ReLoc-PDR, a fusion framework combining PDR and visual relocalization using graph optimization. ReLoc-PDR leverages time-correlated visual observations and learned descriptors to achieve robust positioning in visually degraded environments. A graph optimization-based fusion mechanism with the Tukey kernel effectively corrects cumulative errors and mitigates the impact of abnormal visual observations. Real-world experiments demonstrate that ReLoc-PDR surpasses representative methods in accuracy and robustness, achieving accurate and reliable pedestrian positioning using only a smartphone in challenging environments such as less-textured corridors and dark nighttime scenarios.

I. INTRODUCTION

ROBUST and accurate indoor pedestrian navigation plays a crucial role in enabling various location-based services (LBS). The determination of pedestrian location in satellite-denied environments is a fundamental requirement for numerous applications, including emergency rescue operations, path guidance systems, and augmented reality experiences [1], [2]. The existing indoor positioning solutions relying on the deployment of dedicated infrastructures are susceptible to signal interference and non-line-of-sight (NLOS) conditions, and their widespread deployment can be prohibitively expensive.

In contrast to infrastructure-based positioning methods, pedestrian dead reckoning (PDR) relies on only inertial data to estimate pedestrian location, providing a relatively robust approach in various environments. However, due to the inherent noise in inertial sensors, PDR is susceptible to trajectory drift over long-term positioning. To mitigate this issue, researchers have explored the combination of PDR with additional positioning methods, such as Ultra-Wideband (UWB), WiFi, Bluetooth, among others [3]–[5]. These methods have demonstrated impressive results in indoor localization. However, they often necessitate the presence of extra infrastructure and require rigorous calibration procedures in advance.

The pursuit of a low-cost, robust, and self-contained positioning system is of great importance to flexible and resilient pedestrian navigation. Visual relocalization, which estimates the 6 degree-of-freedom (DoF) pose of a query image against an existing 3D map model, holds potential for achieving drift-free global localization using only a camera. Since smartphones commonly come with built-in cameras, utilizing image-based localization methods on mobile devices becomes a viable approach. However, the challenge lies in the susceptibility of image-based relocalization to environmental factors such as changes in lighting conditions and scene dynamics. Existing 3D structure-based visual localization methods [6]–[9] are computationally demanding and fail to provide real-time pose output. Furthermore, the positioning accuracy of visual relocalization significantly deteriorates in challenging environments due to the scarcity of recognizable features.

Considering the continuity and autonomy advantages of PDR, combining visual relocalization with PDR serves as a complementary approach. This integration allows for the correction of accumulated errors in PDR using visual relocalization results, while PDR improves the continuity and real-time performance of trajectory estimation. Several recent studies have begun exploring this direction [10]–[13]. Existing methods primarily employ dynamic weighting strategies to loosely integrate PDR with visual relocalization. However, these approaches may lack robustness in challenging environments and can result in significant trajectory inconsistencies due to the interference caused by abnormal visual observations.

To tackle these challenges, this work presents ReLoc-PDR, a robust framework for pedestrian inertial positioning aided by visual relocalization. ReLoc-PDR leverages recent advancements in deep learning-based feature extraction [14] and graph optimization [15] to ensure reliable visual feature matching and robust localization. By integrating these techniques, our method effectively mitigates the risk of visual relocalization failure, enhancing the system’s robustness in visually degraded environments. To fuse the pose results from PDR and visual relocalization effectively, we design a graph optimization-based fusion mechanism using the Tukey kernel. This mechanism facilitates cumulative error correction and eliminates the impact of abnormal visual observations on positioning accuracy. As a result, the ReLoc-PDR system exhibits stability and reliability. Real-world experiments were conducted to evaluate the performance of the proposed method. The results demonstrate the efficacy of our approach in various challenging environments, including texture-less areas and nighttime outdoor scenarios.

Our contributions can be summarized as follows:

• We propose ReLoc-PDR, a robust pedestrian positioning system that effectively integrates Pedestrian Dead Reckoning (PDR) and visual relocalization to mitigate positioning drifts.

• We design a robust visual relocalization pipeline that leverages learned global descriptors for image retrieval and learned local feature matching. It enhances the robustness of the positioning system, particularly in visually degraded scenes where traditional methods may struggle.

• We introduce a pose fusion mechanism by incorporating the Tukey kernel into graph optimization that facilitates cumulative error correction. It effectively eliminates the impact of abnormal visual observations on positioning accuracy, ensuring the stability and reliability of the ReLoc-PDR system.
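To make the robust-kernel idea concrete, here is a minimal sketch (not the paper's implementation) of the weight the Tukey biweight kernel assigns to a residual during graph optimization. The tuning constant c = 4.685 is the conventional default for the Tukey biweight and is an assumption here; the key property is that gross outliers, such as abnormal relocalization fixes, receive exactly zero weight rather than being merely attenuated:

```python
def tukey_weight(residual, c=4.685):
    """Tukey biweight influence weight for a scalar residual.

    Residuals with |residual| >= c get weight 0, so outlying visual
    observations are effectively dropped from the optimization, while
    small residuals keep a weight close to 1. The constant c = 4.685
    is the conventional tuning value, assumed for illustration.
    """
    r = abs(residual) / c
    if r >= 1.0:
        return 0.0  # hard cutoff: outlier contributes nothing
    return (1.0 - r * r) ** 2  # smooth down-weighting inside the cutoff
```

This hard cutoff is what distinguishes the Tukey kernel from softer robust losses such as Huber, which only attenuate large residuals; it is why the fusion can fully suppress abnormal visual observations.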

II. RELATED WORK

A. Pedestrian Dead Reckoning

Pedestrian dead reckoning (PDR) relies on measurements obtained from the built-in sensors of smartphones to detect human gait events and estimate step length and heading angle for pedestrian positioning [1], [16]. However, PDR alone is prone to cumulative error over time, resulting in inaccurate positioning, particularly over long distances. To address this limitation, previous studies have attempted to integrate PDR with other absolute localization technologies such as WiFi, Bluetooth, and Ultra-Wideband (UWB). These integration approaches aim to periodically correct the accumulated error of PDR by incorporating absolute position information [3]–[5]. Despite their potential benefits, these methods are heavily reliant on infrastructure availability and can be susceptible to changes in the physical environment. Factors such as variations in signal strength, infrastructure coverage, and environmental conditions may impact the accuracy and reliability of these infrastructure-dependent approaches.
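At its core, the PDR propagation described above is plain dead reckoning: each detected step advances the position by the estimated stride length along the estimated heading. A minimal sketch, where the function name and the north-referenced heading convention are illustrative assumptions rather than the paper's implementation:

```python
import math

def pdr_update(x, y, step_length, heading_rad):
    """Advance a 2D position by one detected step.

    `step_length` (metres) comes from stride-length estimation and
    `heading_rad` (radians, 0 = north along +y in this sketch) from the
    smartphone's orientation filter; both conventions are assumptions
    for illustration. Errors in either quantity accumulate with every
    step, which is exactly the drift PDR is known for.
    """
    return (x + step_length * math.sin(heading_rad),
            y + step_length * math.cos(heading_rad))
```

For example, one 0.7 m step with heading 0 moves the position from (0, 0) to (0, 0.7); since each update builds on the last, a small heading bias bends the whole trajectory, which is why absolute corrections are needed.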

The integration of pedestrian dead reckoning (PDR) with visual relocalization has emerged as a promising research direction for achieving self-contained and highly accurate positioning using smartphones, without the need for additional infrastructure. This research area has gained increasing attention in recent years. Elloumi et al. [17] conducted a comparative study between inertial and vision-based methods for indoor pedestrian localization using smartphones. However, they did not explore the integration of these approaches to further enhance the performance of the localization system. In another study, a refined indoor localization framework was proposed in [10], which combines image-based localization with PDR to improve pedestrian trajectory estimation. Wang et al. [18] introduced a vision-aided PDR localization system that integrates visual and PDR measurements into a unified graph model. Furthermore, a novel indoor localization method called V-PDR was proposed in [11], which integrates image retrieval and PDR using a weighted average strategy. This approach successfully reduces the accumulated error of PDR. Shu et al. [12] employed a dynamic fusion strategy that integrates PDR and image-based localization (IBL) based on the number of inliers in the IBL process. Their approach enables continuous and accurate 3D location estimation for long-term tracking using smartphones. Additionally, in [13], a multimodal fusion algorithm was proposed that loosely couples PDR with visual localization to correct cumulative errors in PDR results. However, despite these advancements, the robustness of these approaches remains limited, and the positioning accuracy may significantly degrade in visually challenging environments. Further improvements are necessary to ensure reliable and accurate positioning in such scenarios.

B. Visual Relocalization

Visual relocalization, also known as image-based localization, refers to the task of estimating the precise 6-DoF camera pose of a query image within a known map. This task can be approached using retrieval-based and structure-based methods. Retrieval-based approaches [19]–[22] estimate the pose of the query image by leveraging the geolocation of the most similar image retrieved from an image database. However, these methods often fall short in terms of localization accuracy. Structure-based methods [6]–[10] rely on establishing correspondences between 2D features in the query image and 3D points in a Structure from Motion (SfM) model using local descriptors. To handle large-scale scenes, these methods typically employ image retrieval as an initial step, restricting the 2D-3D matching to the visible portion of the query image [8], [9]. However, the robustness of traditional localization methods is limited due to the insufficient invariance of handcrafted local features. In recent years, CNN-based local features [14], [23], [24] have exhibited impressive robustness against illumination variations and viewpoint changes. These features have been employed to enhance localization accuracy. For example, [8] presents a comprehensive pipeline for structure-based visual relocalization, incorporating global image retrieval, local feature matching, and pose estimation. However, this method may encounter challenges during the image retrieval stage in scenes with significant appearance variations, as the representation power of global features is limited. Additionally, these methods often yield a low-frequency global position estimation and require high-performance computing platforms.
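The hierarchical structure-based pipeline described above (retrieve candidate database images with a global descriptor, match local features against them, then solve for the 6-DoF pose) can be sketched as follows. The helper callables `retrieve`, `match_local`, and `solve_pnp` are injected stand-ins for a global-descriptor retriever, a learned local-feature matcher, and a RANSAC PnP solver; all names are hypothetical, not real library APIs:

```python
def relocalize(query_img, db, retrieve, match_local, solve_pnp, top_k=5):
    """Hierarchical relocalization sketch: retrieve -> match -> PnP.

    `retrieve(query, db, k)` returns the k most similar database images
    by global descriptor; `match_local(query, ref)` returns 2D-3D
    correspondences; `solve_pnp(matches)` estimates the 6-DoF pose.
    All three are placeholder callables for illustration.
    """
    candidates = retrieve(query_img, db, top_k)  # restrict search space
    matches = []
    for ref in candidates:                       # gather 2D-3D matches
        matches.extend(match_local(query_img, ref))
    if not matches:
        return None                              # relocalization failed
    return solve_pnp(matches)                    # 6-DoF pose estimate
```

Returning `None` on failure reflects the practical point made above: in scenes with strong appearance change the retrieval or matching stage can yield nothing usable, and a fusion framework must tolerate such dropouts.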

This paper is available on arxiv under CC BY 4.0 DEED license.
