ETH Zürich




Abstract:The growing popularity of autonomous systems creates a need for reliable and efficient metric pose retrieval algorithms. Currently used approaches tend to rely on nearest neighbor search of binary descriptors to perform the 2D-3D matching and guarantee real-time capabilities on mobile platforms. These methods struggle, however, with the growing size of the map, changes in viewpoint or appearance, and visual aliasing present in the environment. The rigidly defined descriptor patterns only capture a limited neighborhood of the keypoint and completely ignore the overall visual context. We propose LandmarkBoost, an approach that, in contrast to the conventional 2D-3D matching methods, casts the search problem as a landmark classification task. We use a boosted classifier to classify landmark observations and directly obtain correspondences as classifier scores. We also introduce a formulation of visual context that is flexible, efficient to compute, and can capture relationships in the entire image plane. The original binary descriptors are augmented with contextual information, and informative features are selected by the boosting framework. Through detailed experiments, we evaluate the retrieval quality and performance of LandmarkBoost, demonstrating that it outperforms common state-of-the-art descriptor matching methods.
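
For orientation, the conventional baseline mentioned in the abstract above (nearest neighbor search over binary descriptors) can be sketched as follows. This is a minimal illustration and not LandmarkBoost itself: descriptors are assumed to be stored as Python integers, and the names `hamming` and `match_landmark` as well as the toy 8-bit descriptors are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_landmark(query: int, landmark_descriptors: dict) -> tuple:
    """Brute-force nearest neighbor search.

    Returns (landmark_id, distance) for the stored descriptor that is
    closest to the query in Hamming distance.
    """
    return min(
        ((lid, hamming(query, desc)) for lid, desc in landmark_descriptors.items()),
        key=lambda pair: pair[1],
    )


# Toy 8-bit descriptors (assumption; real binary descriptors are much longer).
landmarks = {"a": 0b11110000, "b": 0b00001111}
nearest = match_landmark(0b11100000, landmarks)
```

Because such a search compares only the descriptor bits, it cannot use the wider visual context, which is the limitation the abstract points out.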




Abstract:Rapid deployment and operation are key requirements in time-critical applications such as Search and Rescue (SaR). Efficiently teleoperated ground robots can support first responders in such situations. However, first-person view teleoperation is sub-optimal in difficult terrain, while a third-person perspective can drastically increase teleoperation performance. Here, we propose a Micro Aerial Vehicle (MAV)-based system that can autonomously provide a third-person perspective to ground robots. While our approach is based on local visual servoing, it further leverages the global localization of several ground robots to seamlessly transfer between these ground robots in GPS-denied environments. As a result, one MAV can support multiple ground robots on demand. Furthermore, our system enables different visual detection regimes, enhanced operability, and return-home functionality. We evaluate our system in real-world SaR scenarios.




Abstract:Visual localization and mapping is a crucial capability to address many challenges in mobile robotics. It constitutes a robust, accurate and cost-effective approach for local and global pose estimation within prior maps. Yet, in highly dynamic environments, like crowded city streets, problems arise as major parts of the image can be covered by dynamic objects. Consequently, visual odometry pipelines often diverge and the localization systems malfunction as detected features are not consistent with the precomputed 3D model. In this work, we present an approach to automatically detect dynamic object instances to improve the robustness of vision-based localization and mapping in crowded environments. By training a convolutional neural network model with a combination of synthetic and real-world data, dynamic object instance masks are learned in a semi-supervised way. The real-world data can be collected with a standard camera and requires minimal further post-processing. Our experiments show that a wide range of dynamic objects can be reliably detected using the presented method. Promising performance is demonstrated on both our own and publicly available datasets, which also shows the generalization capabilities of this approach.




Abstract:Horticultural enterprises are becoming more sophisticated as the range of the crops they target expands. Requirements for enhanced efficiency and productivity have driven the demand for automating on-field operations. However, various problems remain to be solved before such automated systems can be deployed reliably and safely in real-world scenarios. This paper examines major research trends and current challenges in horticultural robotics. Specifically, our work focuses on sensing and perception in the three main horticultural procedures: pollination, yield estimation, and harvesting. For each task, we expose major issues arising from the unstructured, cluttered, and rugged nature of field environments, including variable lighting conditions and difficulties in fruit-specific detection, and highlight promising contemporary studies.




Abstract:We propose a novel scoring concept for visual place recognition based on nearest neighbor descriptor voting and demonstrate how the algorithm naturally emerges from the problem formulation. Based on the observation that the number of votes for matching places can be evaluated using a binomial distribution model, loop closures can be detected with high precision. By casting the problem into a probabilistic framework, we not only remove the need for commonly employed heuristic parameters but also provide a powerful score to classify matching and non-matching places. We present methods for both 2D-2D pose-graph vertex matching and 2D-3D landmark matching based on this scoring. The approach maintains accuracy while being efficient enough for online application through the use of compact (low-dimensional) descriptors and fast nearest neighbor retrieval techniques. The proposed methods are evaluated on several challenging datasets in varied environments, showing state-of-the-art results with high precision and high recall.
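
The binomial vote model in the abstract above can be illustrated with a short sketch. It assumes that each of `n` query descriptors votes for a given place by chance, independently, with probability `p`; it then computes how likely it is to see at least `k` votes under that chance model. The function name, the independence assumption, and all numbers are illustrative assumptions, not the paper's exact formulation.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    A small value means that k votes are unlikely to be a coincidence.
    """
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: 200 query descriptors, each voting for a given
# place by chance with probability 0.01 (about 2 chance votes expected).
chance_level = binomial_tail(200, 2, 0.01)   # vote count consistent with chance
strong_match = binomial_tail(200, 30, 0.01)  # far more votes than chance predicts
```

Under these assumptions, a place that collects 30 votes gets a vanishingly small tail probability, whereas one with 2 votes is indistinguishable from chance.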


Abstract:This paper discusses a large-scale and long-term mapping and localization scenario using the maplab open-source framework. We present a brief overview of the specific algorithms in the system that enable building a consistent map from multiple sessions. We then demonstrate that such a map can be reused even a few months later for efficient 6-DoF localization and that new trajectories can also be registered within the existing 3D model. The datasets presented in this paper are made publicly available.




Abstract:As off-the-shelf (OTS) autopilots become more widely available and user-friendly and the drone market expands, safer, more efficient, and more complex motion planning and control will become necessary for fixed-wing aerial robotic platforms. Considering typical low-level attitude stabilization available on OTS flight controllers, this paper first develops an approach for modeling and identification of the control augmented dynamics for a small fixed-wing Unmanned Aerial Vehicle (UAV). A high-level Nonlinear Model Predictive Controller (NMPC) is subsequently formulated for simultaneous airspeed stabilization, path following, and soft constraint handling, using the identified model for horizon propagation. The approach is explored in several exemplary flight experiments including path following of helix and connected Dubins Aircraft segments in high winds as well as a motor failure scenario. The cost function, insights on its weighting, and additional soft constraints used throughout the experimentation are discussed.




Abstract:In the absence of global positioning information, place recognition is a key capability for enabling localization, mapping and navigation in any environment. Most place recognition methods rely on images, point clouds, or a combination of both. In this work we leverage a segment extraction and matching approach to achieve place recognition in Light Detection and Ranging (LiDAR) based 3D point cloud maps. One challenge related to this approach is the recognition of segments despite changes in point of view or occlusion. We propose using a learning-based method in order to reach a higher recall accuracy than previously proposed methods. Using Convolutional Neural Networks (CNNs), which are state-of-the-art classifiers, we propose a new approach to segment recognition based on learned descriptors. In this paper we compare the effectiveness of three different structures and training methods for CNNs. We demonstrate through several experiments on real-world data collected in an urban driving scenario that the proposed learning-based methods outperform hand-crafted descriptors.




Abstract:This paper introduces flüela driverless: the first autonomous racecar to win a Formula Student Driverless competition. In this competition, among other challenges, an autonomous racecar is tasked to complete 10 laps of a previously unknown racetrack as fast as possible and using only onboard sensing and computing. The key components of flüela's design are its modular redundant sub-systems that allow robust performance despite challenging perceptual conditions or partial system failures. The paper presents the integration of key components of our autonomous racecar, i.e., system design, EKF-based state estimation, LiDAR-based perception, and particle filter-based SLAM. We perform an extensive experimental evaluation on real-world data, demonstrating the system's effectiveness by outperforming the next-best ranking team by almost half the time required to finish a lap. The autonomous racecar reaches lateral and longitudinal accelerations comparable to those achieved by experienced human drivers.




Abstract:Many scenarios require a robot to be able to explore its 3D environment online without human supervision. This is especially relevant for inspection tasks and search and rescue missions. To solve this high-dimensional path planning problem, sampling-based exploration algorithms have proven successful. However, these do not necessarily scale well to larger environments or spaces with narrow openings. This paper presents a 3D exploration planner based on the principles of Next-Best Views (NBVs). In this approach, a Micro-Aerial Vehicle (MAV) equipped with a limited field-of-view depth sensor randomly samples its configuration space to find promising future viewpoints. In order to obtain high sampling efficiency, our planner maintains and uses a history of visited places, and locally optimizes the robot's orientation with respect to unobserved space. We evaluate our method in several simulated scenarios, and compare it against a state-of-the-art exploration algorithm. The experiments show substantial improvements in exploration time ($2\times$ faster), computation time, and path length, and advantages in handling difficult situations such as escaping dead-ends (up to $20\times$ faster). Finally, we validate the online capability of our algorithm on a computationally constrained real-world MAV.
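
The sampling idea described in the abstract above can be sketched in a heavily simplified form. The sketch assumes a 2D occupancy grid instead of a 3D map, ignores occlusion, and omits the history of visited places; it only shows random viewpoint sampling followed by a local search over the heading that sees the most unknown cells. All names, the grid encoding, and the parameter values are hypothetical and are not taken from the paper.

```python
import math
import random

UNKNOWN, FREE, OCCUPIED = -1, 0, 1


def visible_unknown(grid, x, y, yaw, fov=math.radians(90), max_range=5.0):
    """Count unknown cells inside a 2D view cone (no occlusion check)."""
    count = 0
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell != UNKNOWN:
                continue
            dx, dy = c - x, r - y
            dist = math.hypot(dx, dy)
            if dist == 0 or dist > max_range:
                continue
            # Signed angle between the bearing to the cell and the heading,
            # wrapped to [-pi, pi).
            diff = (math.atan2(dy, dx) - yaw + math.pi) % (2 * math.pi) - math.pi
            if abs(diff) <= fov / 2:
                count += 1
    return count


def best_yaw(grid, x, y, n_yaws=16):
    """Pick the discretized heading that sees the most unknown cells."""
    yaws = [2 * math.pi * i / n_yaws for i in range(n_yaws)]
    return max(yaws, key=lambda yaw: visible_unknown(grid, x, y, yaw))


def next_best_view(grid, n_samples=50, rng=None):
    """Randomly sample free cells and return the best (gain, x, y, yaw)."""
    rng = rng or random.Random(0)
    free = [(c, r) for r, row in enumerate(grid)
            for c, cell in enumerate(row) if cell == FREE]
    best = None
    for _ in range(n_samples):
        x, y = rng.choice(free)
        yaw = best_yaw(grid, x, y)
        gain = visible_unknown(grid, x, y, yaw)
        if best is None or gain > best[0]:
            best = (gain, x, y, yaw)
    return best


# Toy map: left half already observed as free, right half still unknown.
grid = [[FREE] * 5 + [UNKNOWN] * 5 for _ in range(5)]
gain, x, y, yaw = next_best_view(grid)
```

A planner along the lines of the abstract would additionally weigh the cost of reaching each candidate and reuse previously visited places; this sketch only covers the gain evaluation.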