Sidescan sonar is a small and cost-effective sensing solution that can be easily mounted on most vessels. Historically, it has been used to produce high-definition images that experts may use to identify targets on the seafloor or in the water column. While solutions have been proposed to produce bathymetry solely from sidescan, or in conjunction with multibeam, they have had limited impact. This is partly a result of mostly being limited to single sidescan lines. In this paper, we propose a modern, salable solution to create high quality survey-scale bathymetry from many sidescan lines. By incorporating multiple observations of the same place, results can be improved as the estimates reinforce each other. Our method is based on sinusoidal representation networks, a recent advance in neural representation learning. We demonstrate the scalability of the approach by producing bathymetry from a large sidescan survey. The resulting quality is demonstrated by comparing to data collected with a high-precision multibeam sensor.
Recent advances in differentiable rendering, which allow calculating the gradients of 2D pixel values with respect to 3D object models, can be applied to estimation of the model parameters by gradient-based optimization with only 2D supervision. It is easy to incorporate deep neural networks into such an optimization pipeline, allowing the leveraging of deep learning techniques. This also largely reduces the requirement for collecting and annotating 3D data, which is very difficult for applications, for example when constructing geometry from 2D sensors. In this work, we propose a differentiable renderer for sidescan sonar imagery. We further demonstrate its ability to solve the inverse problem of directly reconstructing a 3D seafloor mesh from only 2D sidescan sonar data.
Sidescan sonar intensity encodes information about the changes of surface normal of the seabed. However, other factors such as seabed geometry as well as its material composition also affect the return intensity. One can model these intensity changes in a forward direction from the surface normals from bathymetric map and physical properties to the measured intensity or alternatively one can use an inverse model which starts from the intensities and models the surface normals. Here we use an inverse model which leverages deep learning's ability to learn from data; a convolutional neural network is used to estimate the surface normal from the sidescan. Thus the internal properties of the seabed are only implicitly learned. Once this information is estimated, a bathymetric map can be reconstructed through an optimization framework that also includes altimeter readings to provide a sparse depth profile as a constraint. Implicit neural representation learning was recently proposed to represent the bathymetric map in such an optimization framework. In this article, we use a neural network to represent the map and optimize it under constraints of altimeter points and estimated surface normal from sidescan. By fusing multiple observations from different angles from several sidescan lines, the estimated results are improved through optimization. We demonstrate the efficiency and scalability of the approach by reconstructing a high-quality bathymetry using sidescan data from a large sidescan survey. We compare the proposed data-driven inverse model approach of modeling a sidescan with a forward Lambertian model. We assess the quality of each reconstruction by comparing it with data constructed from a multibeam sensor. We are thus able to discuss the strengths and weaknesses of each approach.
We propose a novel data-driven approach for high-resolution bathymetric reconstruction from sidescan. Sidescan sonar (SSS) intensities as a function of range do contain some information about the slope of the seabed. However, that information must be inferred. Additionally, the navigation system provides the estimated trajectory, and normally the altitude along this trajectory is also available. From these we obtain a very coarse seabed bathymetry as an input. This is then combined with the indirect but high-resolution seabed slope information from the sidescan to estimate the full bathymetry. This sparse depth could be acquired by single-beam echo sounder, Doppler Velocity Log (DVL), other bottom tracking sensors or bottom tracking algorithm from sidescan itself. In our work, a fully convolutional network is used to estimate the depth contour and its aleatoric uncertainty from the sidescan images and sparse depth in an end-to-end fashion. The estimated depth is then used together with the range to calculate the point's 3D location on the seafloor. A high-quality bathymetric map can be reconstructed after fusing the depth predictions and the corresponding confidence measures from the neural networks. We show the improvement of the bathymetric map gained by using sparse depths with sidescan over estimates with sidescan alone. We also show the benefit of confidence weighting when fusing multiple bathymetric estimates into a single map.
Registration methods for point clouds have become a key component of many SLAM systems on autonomous vehicles. However, an accurate estimate of the uncertainty of such registration is a key requirement to a consistent fusion of this kind of measurements in a SLAM filter. This estimate, which is normally given as a covariance in the transformation computed between point cloud reference frames, has been modelled following different approaches, among which the most accurate is considered to be the Monte Carlo method. However, a Monte Carlo approximation is cumbersome to use inside a time-critical application such as online SLAM. Efforts have been made to estimate this covariance via machine learning using carefully designed features to abstract the raw point clouds. However, the performance of this approach is sensitive to the features chosen. We argue that it is possible to learn the features along with the covariance by working with the raw data and thus we propose a new approach based on PointNet. In this work, we train this network using the KL divergence between the learned uncertainty distribution and one computed by the Monte Carlo method as the loss. We test the performance of the general model presented applying it to our target use-case of SLAM with an autonomous underwater vehicle (AUV) restricted to the 2-dimensional registration of 3D bathymetric point clouds.
This paper studies the problem of detection and tracking of general objects with long-term dynamics, observed by a mobile robot moving in a large environment. A key problem is that due to the environment scale, it can only observe a subset of the objects at any given time. Since some time passes between observations of objects in different places, the objects might be moved when the robot is not there. We propose a model for this movement in which the objects typically only move locally, but with some small probability they jump longer distances, through what we call global motion. For filtering, we decompose the posterior over local and global movements into two linked processes. The posterior over the global movements and measurement associations is sampled, while we track the local movement analytically using Kalman filters. This novel filter is evaluated on point cloud data gathered autonomously by a mobile robot over an extended period of time. We show that tracking jumping objects is feasible, and that the proposed probabilistic treatment outperforms previous methods when applied to real world data. The key to efficient probabilistic tracking in this scenario is focused sampling of the object posteriors.
In this work, we present a method for tracking and learning the dynamics of all objects in a large scale robot environment. A mobile robot patrols the environment and visits the different locations one by one. Movable objects are discovered by change detection, and tracked throughout the robot deployment. For tracking, we extend the Rao-Blackwellized particle filter of previous work with birth and death processes, enabling the method to handle an arbitrary number of objects. Target births and associations are sampled using Gibbs sampling. The parameters of the system are then learnt using the Expectation Maximization algorithm in an unsupervised fashion. The system therefore enables learning of the dynamics of one particular environment, and of its objects. The algorithm is evaluated on data collected autonomously by a mobile robot in an office environment during a real-world deployment. We show that the algorithm automatically identifies and tracks the moving objects within 3D maps and infers plausible dynamics models, significantly decreasing the modeling bias of our previous work. The proposed method represents an improvement over previous methods for environment dynamics learning as it allows for learning of fine grained processes.
In this paper we introduce a system for unsupervised object discovery and segmentation of RGBD-images. The system models the sensor noise directly from data, allowing accurate segmentation without sensor specific hand tuning of measurement noise models making use of the recently introduced Statistical Inlier Estimation (SIE) method. Through a fully probabilistic formulation, the system is able to apply probabilistic inference, enabling reliable segmentation in previously challenging scenarios. In addition, we introduce new methods for filtering out false positives, significantly improving the signal to noise ratio. We show that the system significantly outperform state-of-the-art in on a challenging real-world dataset.
Thanks to the efforts of the robotics and autonomous systems community, robots are becoming ever more capable. There is also an increasing demand from end-users for autonomous service robots that can operate in real environments for extended periods. In the STRANDS project we are tackling this demand head-on by integrating state-of-the-art artificial intelligence and robotics research into mobile service robots, and deploying these systems for long-term installations in security and care environments. Over four deployments, our robots have been operational for a combined duration of 104 days autonomously performing end-user defined tasks, covering 116km in the process. In this article we describe the approach we have used to enable long-term autonomous operation in everyday environments, and how our robots are able to use their long run times to improve their own performance.