Authors:
(1) Luyuan Peng, Acoustic Research Laboratory, National University of Singapore;
(2) Hari Vishnu, Acoustic Research Laboratory, National University of Singapore;
(3) Mandar Chitre, Acoustic Research Laboratory, National University of Singapore;
(4) Yuen Min Too, Acoustic Research Laboratory, National University of Singapore;
(5) Bharath Kalyan, Acoustic Research Laboratory, National University of Singapore;
(6) Rajat Mishra, Acoustic Research Laboratory, National University of Singapore.
Visual localization is a promising solution to the problem of localization in a known underwater environment for inspection. Inspection missions often involve operations around marine structures, making acoustic navigation with beacons difficult due to shadowing and multipath [1]. Since the vehicle has to operate close to structures, inertial navigation systems, which accumulate errors over time, may not provide sufficient positioning accuracy [1]. In comparison, visual localization using cameras may offer a cost-effective, consistent and accurate alternative in such missions. Previous work has shown that regression methods based on PoseNet [3] can effectively regress a 6-degree-of-freedom (DOF) pose from a single 224×224 RGB image, with approximately 6 cm position accuracy and 1.7° orientation accuracy when tested on simulated underwater datasets [2]. It was also shown that using a deeper neural network as the feature extractor may improve the model's localization accuracy [2]. This work further investigates the effectiveness of such models on underwater datasets and explores techniques to further improve localization performance. We make three main contributions:
We explore the use of long short-term memory (LSTM) units [4] in the pose regression model to exploit the spatial correlation of the image features and to achieve a more structured dimensionality reduction [5] (see the sketch after this list).
We test the proposed models on underwater datasets collected from a 1.6 m × 1 m × 1 m water-filled tank using a remotely operated vehicle (ROV). The tank offers an environment in which we can control lighting and turbidity. The models achieve good accuracy on these datasets, with performance comparable to that obtained on the simulator dataset.
The base dataset consists of images taken from the first camera of a stereo pair mounted on the vehicle. Furthermore, we explore the performance improvement obtained by augmenting the data with the corresponding images from the second camera (see the pose-composition sketch below). Fig. 2 shows some examples of underwater scenes in the tank dataset.
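To make the first two contributions concrete, the following is a minimal sketch of a PoseNet-style regressor with an LSTM head, in the spirit of [3] and [5]. The backbone choice (ResNet-18), the 16×32 reshaping of the feature vector, and the layer sizes are illustrative assumptions, not the exact architecture used in this work.

```python
# Minimal sketch (not the paper's exact architecture) of a PoseNet-style
# pose regressor with an LSTM head, loosely following [3] and [5].
import torch
import torch.nn as nn
from torchvision import models

class PoseLSTMNet(nn.Module):
    def __init__(self, hidden: int = 128):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # Keep everything up to (and including) global average pooling.
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])
        # Two bidirectional LSTMs scan the reshaped feature grid along its
        # rows and columns for structured dimensionality reduction [5].
        self.lstm_rows = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.lstm_cols = nn.LSTM(16, hidden, batch_first=True, bidirectional=True)
        self.fc_pos = nn.Linear(4 * hidden, 3)  # x, y, z position
        self.fc_rot = nn.Linear(4 * hidden, 4)  # orientation quaternion

    def forward(self, img: torch.Tensor):
        feat = self.encoder(img).flatten(1)       # (B, 512) global feature
        grid = feat.view(-1, 16, 32)              # reshape to a 16x32 grid
        _, (h_rows, _) = self.lstm_rows(grid)                  # scan rows
        _, (h_cols, _) = self.lstm_cols(grid.transpose(1, 2))  # scan columns
        h = torch.cat([h_rows, h_cols]).permute(1, 0, 2).flatten(1)
        q = self.fc_rot(h)
        return self.fc_pos(h), q / q.norm(dim=1, keepdim=True)
```

Training such a model typically minimizes a weighted sum of position and orientation errors, as in PoseNet [3], e.g. L = ‖p̂ − p‖ + β‖q̂ − q‖ with a scale factor β balancing the two terms.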
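For the stereo augmentation in the third contribution, each image from the second camera needs its own ground-truth pose, which can be obtained by composing the first camera's pose with the rigid left-to-right extrinsic of the stereo pair. The sketch below illustrates this composition; the 0.1 m baseline and the quaternion convention are placeholder assumptions, not values from the paper.

```python
# Hedged sketch: derive the second (right) camera's ground-truth pose from
# the first (left) camera's pose. The 0.1 m baseline is a placeholder.
import numpy as np
from scipy.spatial.transform import Rotation as R

def right_camera_pose(p_left: np.ndarray, q_left: np.ndarray, baseline: float = 0.1):
    """p_left: (3,) position in metres; q_left: (x, y, z, w) quaternion."""
    rot = R.from_quat(q_left)
    # The right camera sits `baseline` metres along the left camera's +x axis.
    p_right = p_left + rot.apply([baseline, 0.0, 0.0])
    # Orientation is unchanged for a rigid, parallel stereo pair.
    return p_right, q_left
```

Each stereo frame can then contribute two labelled training samples, one per camera.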
This paper is available on arxiv under CC BY 4.0 DEED license.