Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Philippe Giguère

SilvaScenes: Tree Segmentation and Species Classification from Under-Canopy Images in Natural Forests

Oct 10, 2025

David-Alexandre Duclos, William Guimont-Martin, Gabriel Jeanson, Arthur Larochelle-Tremblay, Théo Defosse, Frédéric Moore, Philippe Nolet, François Pomerleau, Philippe Giguère

Abstract:Interest in robotics for forest management is growing, but perception in complex, natural environments remains a significant hurdle. Conditions such as heavy occlusion, variable lighting, and dense vegetation pose challenges to automated systems, which are essential for precision forestry, biodiversity monitoring, and the automation of forestry equipment. These tasks rely on advanced perceptual capabilities, such as detection and fine-grained species classification of individual trees. Yet, existing datasets are inadequate to develop such perception systems, as they often focus on urban settings or a limited number of species. To address this, we present SilvaScenes, a new dataset for instance segmentation of tree species from under-canopy images. Collected across five bioclimatic domains in Quebec, Canada, SilvaScenes features 1476 trees from 24 species with annotations from forestry experts. We demonstrate the relevance and challenging nature of our dataset by benchmarking modern deep learning approaches for instance segmentation. Our results show that, while tree segmentation is easy, with a top mean average precision (mAP) of 67.65%, species classification remains a significant challenge with an mAP of only 35.69%. Our dataset and source code will be available at https://github.com/norlab-ulaval/SilvaScenes.

* 8 pages, 5 figures

Via

Access Paper or Ask Questions

UAV-Assisted Self-Supervised Terrain Awareness for Off-Road Navigation

Sep 26, 2024

Jean-Michel Fortin, Olivier Gamache, William Fecteau, Effie Daum, William Larrivée-Hardy, François Pomerleau, Philippe Giguère

Figure 1 for UAV-Assisted Self-Supervised Terrain Awareness for Off-Road Navigation

Figure 2 for UAV-Assisted Self-Supervised Terrain Awareness for Off-Road Navigation

Figure 3 for UAV-Assisted Self-Supervised Terrain Awareness for Off-Road Navigation

Figure 4 for UAV-Assisted Self-Supervised Terrain Awareness for Off-Road Navigation

Abstract:Terrain awareness is an essential milestone to enable truly autonomous off-road navigation. Accurately predicting terrain characteristics allows optimizing a vehicle's path against potential hazards. Recent methods use deep neural networks to predict traversability-related terrain properties in a self-supervised manner, relying on proprioception as a training signal. However, onboard cameras are inherently limited by their point-of-view relative to the ground, suffering from occlusions and vanishing pixel density with distance. This paper introduces a novel approach for self-supervised terrain characterization using an aerial perspective from a hovering drone. We capture terrain-aligned images while sampling the environment with a ground vehicle, effectively training a simple predictor for vibrations, bumpiness, and energy consumption. Our dataset includes 2.8 km of off-road data collected in forest environment, comprising 13 484 ground-based images and 12 935 aerial images. Our findings show that drone imagery improves terrain property prediction by 21.37 % on the whole dataset and 37.35 % in high vegetation, compared to ground robot images. We conduct ablation studies to identify the main causes of these performance improvements. We also demonstrate the real-world applicability of our approach by scouting an unseen area with a drone, planning and executing an optimized path on the ground.

* 7 pages, 5 figures, submitted to ICRA 2025

Via

Access Paper or Ask Questions

BoQ: A Place is Worth a Bag of Learnable Queries

May 12, 2024

Amar Ali-bey, Brahim Chaib-draa, Philippe Giguère

Abstract:In visual place recognition, accurately identifying and matching images of locations under varying environmental conditions and viewpoints remains a significant challenge. In this paper, we introduce a new technique, called Bag-of-Queries (BoQ), which learns a set of global queries designed to capture universal place-specific attributes. Unlike existing methods that employ self-attention and generate the queries directly from the input features, BoQ employs distinct learnable global queries, which probe the input features via cross-attention, ensuring consistent information aggregation. In addition, our technique provides an interpretable attention mechanism and integrates with both CNN and Vision Transformer backbones. The performance of BoQ is demonstrated through extensive experiments on 14 large-scale benchmarks. It consistently outperforms current state-of-the-art techniques including NetVLAD, MixVPR and EigenPlaces. Moreover, as a global retrieval technique (one-stage), BoQ surpasses two-stage retrieval methods, such as Patch-NetVLAD, TransVPR and R2Former, all while being orders of magnitude faster and more efficient. The code and model weights are publicly available at https://github.com/amaralibey/Bag-of-Queries.

* Accepted at CVPR 2024

Via

Access Paper or Ask Questions

Field Report on a Wearable and Versatile Solution for Field Acquisition and Exploration

Apr 30, 2024

Olivier Gamache, Jean-Michel Fortin, Matěj Boxan, François Pomerleau, Philippe Giguère

Abstract:This report presents a wearable plug-and-play platform for data acquisition in the field. The platform, extending a waterproof Pelican Case into a 20 kg backpack offers 5.5 hours of power autonomy, while recording data with two cameras, a lidar, an Inertial Measurement Unit (IMU), and a Global Navigation Satellite System (GNSS) receiver. The system only requires a single operator and is readily controlled with a built-in screen and buttons. Due to its small footprint, it offers greater flexibility than large vehicles typically deployed in off-trail environments. We describe the platform's design, detailing the mechanical parts, electrical components, and software stack. We explain the system's limitations, drawing from its extensive deployment spanning over 20 kilometers of trajectories across various seasons, environments, and weather conditions. We derive valuable lessons learned from these deployments and present several possible applications for the system. The possible use cases consider not only academic research but also insights from consultations with our industrial partners. The mechanical design including all CAD files, as well as the software stack, are publicly available at https://github.com/norlab-ulaval/backpack_workspace.

* 5 pages, 6 figures, Accepted for the Workshop on Field Robotics at ICRA2024

Via

Access Paper or Ask Questions

Proprioception Is All You Need: Terrain Classification for Boreal Forests

Mar 25, 2024

Damien LaRocque, William Guimont-Martin, David-Alexandre Duclos, Philippe Giguère, François Pomerleau

Figure 1 for Proprioception Is All You Need: Terrain Classification for Boreal Forests

Figure 2 for Proprioception Is All You Need: Terrain Classification for Boreal Forests

Figure 3 for Proprioception Is All You Need: Terrain Classification for Boreal Forests

Figure 4 for Proprioception Is All You Need: Terrain Classification for Boreal Forests

Abstract:Recent works in field robotics highlighted the importance of resiliency against different types of terrains. Boreal forests, in particular, are home to many mobility-impeding terrains that should be considered for off-road autonomous navigation. Also, being one of the largest land biomes on Earth, boreal forests are an area where autonomous vehicles are expected to become increasingly common. In this paper, we address this issue by introducing BorealTC, a publicly available dataset for proprioceptive-based terrain classification (TC). Recorded with a Husky A200, our dataset contains 116 min of Inertial Measurement Unit (IMU), motor current, and wheel odometry data, focusing on typical boreal forest terrains, notably snow, ice, and silty loam. Combining our dataset with another dataset from the state-of-the-art, we evaluate both a Convolutional Neural Network (CNN) and the novel state space model (SSM)-based Mamba architecture on a TC task. Interestingly, we show that while CNN outperforms Mamba on each separate dataset, Mamba achieves greater accuracy when trained on a combination of both. In addition, we demonstrate that Mamba's learning capacity is greater than a CNN for increasing amounts of data. We show that the combination of two TC datasets yields a latent space that can be interpreted with the properties of the terrains. We also discuss the implications of merging datasets on classification. Our source code and dataset are publicly available online: https://github.com/norlab-ulaval/BorealTC.

* Submitted to the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

Via

Access Paper or Ask Questions

Saturation-Aware Angular Velocity Estimation: Extending the Robustness of SLAM to Aggressive Motions

Oct 11, 2023

Simon-Pierre Deschênes, Dominic Baril, Matěj Boxan, Johann Laconte, Philippe Giguère, François Pomerleau

Figure 1 for Saturation-Aware Angular Velocity Estimation: Extending the Robustness of SLAM to Aggressive Motions

Figure 2 for Saturation-Aware Angular Velocity Estimation: Extending the Robustness of SLAM to Aggressive Motions

Figure 3 for Saturation-Aware Angular Velocity Estimation: Extending the Robustness of SLAM to Aggressive Motions

Figure 4 for Saturation-Aware Angular Velocity Estimation: Extending the Robustness of SLAM to Aggressive Motions

Abstract:We propose a novel angular velocity estimation method to increase the robustness of Simultaneous Localization And Mapping (SLAM) algorithms against gyroscope saturations induced by aggressive motions. Field robotics expose robots to various hazards, including steep terrains, landslides, and staircases, where substantial accelerations and angular velocities can occur if the robot loses stability and tumbles. These extreme motions can saturate sensor measurements, especially gyroscopes, which are the first sensors to become inoperative. While the structural integrity of the robot is at risk, the resilience of the SLAM framework is oftentimes given little consideration. Consequently, even if the robot is physically capable of continuing the mission, its operation will be compromised due to a corrupted representation of the world. Regarding this problem, we propose a way to estimate the angular velocity using accelerometers during extreme rotations caused by tumbling. We show that our method reduces the median localization error by 71.5 % in translation and 65.5 % in rotation and reduces the number of SLAM failures by 73.3 % on the collected data. We also propose the Tumbling-Induced Gyroscope Saturation (TIGS) dataset, which consists of outdoor experiments recording the motion of a lidar subject to angular velocities four times higher than other available datasets. The dataset is available online at https://github.com/norlab-ulaval/Norlab_wiki/wiki/TIGS-Dataset.

* 7 pages, 7 figures, submitted to the 2024 IEEE International Conference on Robotics and Automation (ICRA2024), Yokohama, Japan

Via

Access Paper or Ask Questions

Exposing the Unseen: Exposure Time Emulation for Offline Benchmarking of Vision Algorithms

Sep 22, 2023

Olivier Gamache, Jean-Michel Fortin, Matěj Boxan, François Pomerleau, Philippe Giguère

Abstract:Visual Odometry (VO) is one of the fundamental tasks in computer vision for robotics. However, its performance is deeply affected by High Dynamic Range (HDR) scenes, omnipresent outdoor. While new Automatic-Exposure (AE) approaches to mitigate this have appeared, their comparison in a reproducible manner is problematic. This stems from the fact that the behavior of AE depends on the environment, and it affects the image acquisition process. Consequently, AE has traditionally only been benchmarked in an online manner, making the experiments non-reproducible. To solve this, we propose a new methodology based on an emulator that can generate images at any exposure time. It leverages BorealHDR, a unique multi-exposure stereo dataset collected over 8.4 km, on 50 trajectories with challenging illumination conditions. Moreover, it contains pose ground truth for each image and a global 3D map, based on lidar data. We show that using these images acquired at different exposure times, we can emulate realistic images keeping a Root-Mean-Square Error (RMSE) below 1.78 % compared to ground truth images. To demonstrate the practicality of our approach for offline benchmarking, we compared three state-of-the-art AE algorithms on key elements of Visual Simultaneous Localization And Mapping (VSLAM) pipeline, against four baselines. Consequently, reproducible evaluation of AE is now possible, speeding up the development of future approaches. Our code and dataset are available online at this link: https://github.com/norlab-ulaval/BorealHDR

* 6 pages, 6 figures, submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

Via

Access Paper or Ask Questions

RTS-GT: Robotic Total Stations Ground Truthing dataset

Sep 21, 2023

Maxime Vaidis, Mohsen Hassanzadeh Shahraji, Effie Daum, William Dubois, Philippe Giguère, François Pomerleau

Figure 1 for RTS-GT: Robotic Total Stations Ground Truthing dataset

Figure 2 for RTS-GT: Robotic Total Stations Ground Truthing dataset

Figure 3 for RTS-GT: Robotic Total Stations Ground Truthing dataset

Figure 4 for RTS-GT: Robotic Total Stations Ground Truthing dataset

Abstract:Numerous datasets and benchmarks exist to assess and compare Simultaneous Localization and Mapping (SLAM) algorithms. Nevertheless, their precision must follow the rate at which SLAM algorithms improved in recent years. Moreover, current datasets fall short of comprehensive data-collection protocol for reproducibility and the evaluation of the precision or accuracy of the recorded trajectories. With this objective in mind, we proposed the Robotic Total Stations Ground Truthing dataset (RTS-GT) dataset to support localization research with the generation of six-Degrees Of Freedom (DOF) ground truth trajectories. This novel dataset includes six-DOF ground truth trajectories generated using a system of three Robotic Total Stations (RTSs) tracking moving robotic platforms. Furthermore, we compare the performance of the RTS-based system to a Global Navigation Satellite System (GNSS)-based setup. The dataset comprises around sixty experiments conducted in various conditions over a period of 17 months, and encompasses over 49 kilometers of trajectories, making it the most extensive dataset of RTS-based measurements to date. Additionally, we provide the precision of all poses for each experiment, a feature not found in the current state-of-the-art datasets. Our results demonstrate that RTSs provide measurements that are 22 times more stable than GNSS in various environmental settings, making them a valuable resource for SLAM benchmark development.

* 7 pages; Submitted to ICRA 2024

Via

Access Paper or Ask Questions

DRIVE: Data-driven Robot Input Vector Exploration

Sep 19, 2023

Dominic Baril, Simon-Pierre Deschênes, Luc Coupal, Cyril Goffin, Julien Lépine, Philippe Giguère, François Pomerleau

Abstract:An accurate motion model is a fundamental component of most autonomous navigation systems. While much work has been done on improving model formulation, no standard protocol exists for gathering empirical data required to train models. In this work, we address this issue by proposing Data-driven Robot Input Vector Exploration (DRIVE), a protocol that enables characterizing uncrewed ground vehicles (UGVs) input limits and gathering empirical model training data. We also propose a novel learned slip approach outperforming similar acceleration learning approaches. Our contributions are validated through an extensive experimental evaluation, cumulating over 7 km and 1.8 h of driving data over three distinct UGVs and four terrain types. We show that our protocol offers increased predictive performance over common human-driven data-gathering protocols. Furthermore, our protocol converges with 46 s of training data, almost four times less than the shortest human dataset gathering protocol. We show that the operational limit for our model is reached in extreme slip conditions encountered on surfaced ice. DRIVE is an efficient way of characterizing UGV motion in its operational conditions. Our code and dataset are both available online at this link: https://github.com/norlab-ulaval/DRIVE.

* 6 pages, 7 figures, submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

Via

Access Paper or Ask Questions

MaskBEV: Joint Object Detection and Footprint Completion for Bird's-eye View 3D Point Clouds

Jul 04, 2023

William Guimont-Martin, Jean-Michel Fortin, François Pomerleau, Philippe Giguère

Abstract:Recent works in object detection in LiDAR point clouds mostly focus on predicting bounding boxes around objects. This prediction is commonly achieved using anchor-based or anchor-free detectors that predict bounding boxes, requiring significant explicit prior knowledge about the objects to work properly. To remedy these limitations, we propose MaskBEV, a bird's-eye view (BEV) mask-based object detector neural architecture. MaskBEV predicts a set of BEV instance masks that represent the footprints of detected objects. Moreover, our approach allows object detection and footprint completion in a single pass. MaskBEV also reformulates the detection problem purely in terms of classification, doing away with regression usually done to predict bounding boxes. We evaluate the performance of MaskBEV on both SemanticKITTI and KITTI datasets while analyzing the architecture advantages and limitations.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Via

Access Paper or Ask Questions