Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Oisin Mac Aodha

Generating Binary Species Range Maps

Aug 28, 2024

Filip Dorm, Christian Lange, Scott Loarie, Oisin Mac Aodha

Figure 1 for Generating Binary Species Range Maps

Figure 2 for Generating Binary Species Range Maps

Figure 3 for Generating Binary Species Range Maps

Figure 4 for Generating Binary Species Range Maps

Abstract:Accurately predicting the geographic ranges of species is crucial for assisting conservation efforts. Traditionally, range maps were manually created by experts. However, species distribution models (SDMs) and, more recently, deep learning-based variants offer a potential automated alternative. Deep learning-based SDMs generate a continuous probability representing the predicted presence of a species at a given location, which must be binarized by setting per-species thresholds to obtain binary range maps. However, selecting appropriate per-species thresholds to binarize these predictions is non-trivial as different species can require distinct thresholds. In this work, we evaluate different approaches for automatically identifying the best thresholds for binarizing range maps using presence-only data. This includes approaches that require the generation of additional pseudo-absence data, along with ones that only require presence data. We also propose an extension of an existing presence-only technique that is more robust to outliers. We perform a detailed evaluation of different thresholding techniques on the tasks of binary range estimation and large-scale fine-grained visual classification, and we demonstrate improved performance over existing pseudo-absence free approaches using our method.

* Workshop on Computer Vision for Ecology at ECCV 2024

Via

Access Paper or Ask Questions

Deep learning-based ecological analysis of camera trap images is impacted by training data quality and size

Aug 26, 2024

Omiros Pantazis, Peggy Bevan, Holly Pringle, Guilherme Braga Ferreira, Daniel J. Ingram, Emily Madsen, Liam Thomas, Dol Raj Thanet, Thakur Silwal, Santosh Rayamajhi(+3 more)

Figure 1 for Deep learning-based ecological analysis of camera trap images is impacted by training data quality and size

Figure 2 for Deep learning-based ecological analysis of camera trap images is impacted by training data quality and size

Figure 3 for Deep learning-based ecological analysis of camera trap images is impacted by training data quality and size

Figure 4 for Deep learning-based ecological analysis of camera trap images is impacted by training data quality and size

Abstract:Large wildlife image collections from camera traps are crucial for biodiversity monitoring, offering insights into species richness, occupancy, and activity patterns. However, manual processing of these data is time-consuming, hindering analytical processes. To address this, deep neural networks have been widely adopted to automate image analysis. Despite their growing use, the impact of model training decisions on downstream ecological metrics remains unclear. Here, we analyse camera trap data from an African savannah and an Asian sub-tropical dry forest to compare key ecological metrics derived from expert-generated species identifications with those generated from deep neural networks. We assess the impact of model architecture, training data noise, and dataset size on ecological metrics, including species richness, occupancy, and activity patterns. Our results show that while model architecture has minimal impact, large amounts of noise and reduced dataset size significantly affect these metrics. Nonetheless, estimated ecological metrics are resilient to considerable noise, tolerating up to 10% error in species labels and a 50% reduction in training set size without changing significantly. We also highlight that conventional metrics like classification error may not always be representative of a model's ability to accurately measure ecological metrics. We conclude that ecological metrics derived from deep neural network predictions closely match those calculated from expert labels and remain robust to variations in the factors explored. However, training decisions for deep neural networks can impact downstream ecological analysis. Therefore, practitioners should prioritize creating large, clean training sets and evaluate deep neural network solutions based on their ability to measure the ecological metrics of interest.

Via

Access Paper or Ask Questions

AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings

Jun 13, 2024

Jamie Watson, Filippo Aleotti, Mohamed Sayed, Zawar Qureshi, Oisin Mac Aodha, Gabriel Brostow, Michael Firman, Sara Vicente

Abstract:Extracting planes from a 3D scene is useful for downstream tasks in robotics and augmented reality. In this paper we tackle the problem of estimating the planar surfaces in a scene from posed images. Our first finding is that a surprisingly competitive baseline results from combining popular clustering algorithms with recent improvements in 3D geometry estimation. However, such purely geometric methods are understandably oblivious to plane semantics, which are crucial to discerning distinct planes. To overcome this limitation, we propose a method that predicts multi-view consistent plane embeddings that complement geometry when clustering points into planes. We show through extensive evaluation on the ScanNetV2 dataset that our new method outperforms existing approaches and our strong geometric baseline for the task of plane estimation.

* Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Via

Access Paper or Ask Questions

Labeled Data Selection for Category Discovery

Jun 07, 2024

Bingchen Zhao, Nico Lang, Serge Belongie, Oisin Mac Aodha

Figure 1 for Labeled Data Selection for Category Discovery

Figure 2 for Labeled Data Selection for Category Discovery

Figure 3 for Labeled Data Selection for Category Discovery

Figure 4 for Labeled Data Selection for Category Discovery

Abstract:Category discovery methods aim to find novel categories in unlabeled visual data. At training time, a set of labeled and unlabeled images are provided, where the labels correspond to the categories present in the images. The labeled data provides guidance during training by indicating what types of visual properties and features are relevant for performing discovery in the unlabeled data. As a result, changing the categories present in the labeled set can have a large impact on what is ultimately discovered in the unlabeled set. Despite its importance, the impact of labeled data selection has not been explored in the category discovery literature to date. We show that changing the labeled data can significantly impact discovery performance. Motivated by this, we propose two new approaches for automatically selecting the most suitable labeled data based on the similarity between the labeled and unlabeled data. Our observation is that, unlike in conventional supervised transfer learning, the best labeled is neither too similar, nor too dissimilar, to the unlabeled categories. Our resulting approaches obtains state-of-the-art discovery performance across a range of challenging fine-grained benchmark datasets.

Via

Access Paper or Ask Questions

GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions

Jun 07, 2024

Salvatore Esposito, Qingshan Xu, Kacper Kania, Charlie Hewitt, Octave Mariotti, Lohit Petikam, Julien Valentin, Arno Onken, Oisin Mac Aodha

Figure 1 for GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions

Figure 2 for GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions

Figure 3 for GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions

Figure 4 for GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions

Abstract:We introduce a new generative approach for synthesizing 3D geometry and images from single-view collections. Most existing approaches predict volumetric density to render multi-view consistent images. By employing volumetric rendering using neural radiance fields, they inherit a key limitation: the generated geometry is noisy and unconstrained, limiting the quality and utility of the output meshes. To address this issue, we propose GeoGen, a new SDF-based 3D generative model trained in an end-to-end manner. Initially, we reinterpret the volumetric density as a Signed Distance Function (SDF). This allows us to introduce useful priors to generate valid meshes. However, those priors prevent the generative model from learning details, limiting the applicability of the method to real-world scenarios. To alleviate that problem, we make the transformation learnable and constrain the rendered depth map to be consistent with the zero-level set of the SDF. Through the lens of adversarial training, we encourage the network to produce higher fidelity details on the output meshes. For evaluation, we introduce a synthetic dataset of human avatars captured from 360-degree camera angles, to overcome the challenges presented by real-world datasets, which often lack 3D consistency and do not cover all camera angles. Our experiments on multiple datasets show that GeoGen produces visually and quantitatively better geometry than the previous generative models based on neural radiance fields.

* Computer Vision and Pattern Recognition 2024

Via

Access Paper or Ask Questions

Enhancing 2D Representation Learning with a 3D Prior

Jun 04, 2024

Mehmet Aygün, Prithviraj Dhar, Zhicheng Yan, Oisin Mac Aodha, Rakesh Ranjan

Figure 1 for Enhancing 2D Representation Learning with a 3D Prior

Figure 2 for Enhancing 2D Representation Learning with a 3D Prior

Figure 3 for Enhancing 2D Representation Learning with a 3D Prior

Figure 4 for Enhancing 2D Representation Learning with a 3D Prior

Abstract:Learning robust and effective representations of visual data is a fundamental task in computer vision. Traditionally, this is achieved by training models with labeled data which can be expensive to obtain. Self-supervised learning attempts to circumvent the requirement for labeled data by learning representations from raw unlabeled visual data alone. However, unlike humans who obtain rich 3D information from their binocular vision and through motion, the majority of current self-supervised methods are tasked with learning from monocular 2D image collections. This is noteworthy as it has been demonstrated that shape-centric visual processing is more robust compared to texture-biased automated methods. Inspired by this, we propose a new approach for strengthening existing self-supervised methods by explicitly enforcing a strong 3D structural prior directly into the model during training. Through experiments, across a range of datasets, we demonstrate that our 3D aware representations are more robust compared to conventional self-supervised baselines.

Via

Access Paper or Ask Questions

Click to Grasp: Zero-Shot Precise Manipulation via Visual Diffusion Descriptors

Mar 21, 2024

Nikolaos Tsagkas, Jack Rome, Subramanian Ramamoorthy, Oisin Mac Aodha, Chris Xiaoxuan Lu

Abstract:Precise manipulation that is generalizable across scenes and objects remains a persistent challenge in robotics. Current approaches for this task heavily depend on having a significant number of training instances to handle objects with pronounced visual and/or geometric part ambiguities. Our work explores the grounding of fine-grained part descriptors for precise manipulation in a zero-shot setting by utilizing web-trained text-to-image diffusion-based generative models. We tackle the problem by framing it as a dense semantic part correspondence task. Our model returns a gripper pose for manipulating a specific part, using as reference a user-defined click from a source image of a visually different instance of the same object. We require no manual grasping demonstrations as we leverage the intrinsic object geometry and features. Practical experiments in a real-world tabletop scenario validate the efficacy of our approach, demonstrating its potential for advancing semantic-aware robotics manipulation. Web page: https://tsagkas.github.io/click2grasp

* 8 pages, 4 figures

Via

Access Paper or Ask Questions

Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Dec 20, 2023

Octave Mariotti, Oisin Mac Aodha, Hakan Bilen

Figure 1 for Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Figure 2 for Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Figure 3 for Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Figure 4 for Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Abstract:Recent progress in self-supervised representation learning has resulted in models that are capable of extracting image features that are not only effective at encoding image level, but also pixel-level, semantics. These features have been shown to be effective for dense visual semantic correspondence estimation, even outperforming fully-supervised methods. Nevertheless, current self-supervised approaches still fail in the presence of challenging image characteristics such as symmetries and repeated parts. To address these limitations, we propose a new approach for semantic correspondence estimation that supplements discriminative self-supervised features with 3D understanding via a weak geometric spherical prior. Compared to more involved 3D pipelines, our model only requires weak viewpoint information, and the simplicity of our spherical representation enables us to inject informative geometric priors into the model during training. We propose a new evaluation metric that better accounts for repeated part and symmetry-induced mistakes. We present results on the challenging SPair-71k dataset, where we show that our approach demonstrates is capable of distinguishing between symmetric views and repeated parts across many object categories, and also demonstrate that we can generalize to unseen classes on the AwA dataset.

Via

Access Paper or Ask Questions

Active Learning-Based Species Range Estimation

Nov 03, 2023

Christian Lange, Elijah Cole, Grant Van Horn, Oisin Mac Aodha

Figure 1 for Active Learning-Based Species Range Estimation

Figure 2 for Active Learning-Based Species Range Estimation

Figure 3 for Active Learning-Based Species Range Estimation

Figure 4 for Active Learning-Based Species Range Estimation

Abstract:We propose a new active learning approach for efficiently estimating the geographic range of a species from a limited number of on the ground observations. We model the range of an unmapped species of interest as the weighted combination of estimated ranges obtained from a set of different species. We show that it is possible to generate this candidate set of ranges by using models that have been trained on large weakly supervised community collected observation data. From this, we develop a new active querying approach that sequentially selects geographic locations to visit that best reduce our uncertainty over an unmapped species' range. We conduct a detailed evaluation of our approach and compare it to existing active learning methods using an evaluation dataset containing expert-derived ranges for one thousand species. Our results demonstrate that our method outperforms alternative active learning methods and approaches the performance of end-to-end trained models, even when only using a fraction of the data. This highlights the utility of active learning via transfer learned spatial representations for species range estimation. It also emphasizes the value of leveraging emerging large-scale crowdsourced datasets, not only for modeling a species' range, but also for actively discovering them.

* NeurIPS 2023

Via

Access Paper or Ask Questions

Whombat: An open-source annotation tool for machine learning development in bioacoustics

Aug 24, 2023

Santiago Martinez Balvanera, Oisin Mac Aodha, Matthew J. Weldy, Holly Pringle, Ella Browning, Kate E. Jones

Figure 1 for Whombat: An open-source annotation tool for machine learning development in bioacoustics

Abstract:1. Automated analysis of bioacoustic recordings using machine learning (ML) methods has the potential to greatly scale biodiversity monitoring efforts. The use of ML for high-stakes applications, such as conservation research, demands a data-centric approach with a focus on utilizing carefully annotated and curated evaluation and training data that is relevant and representative. Creating annotated datasets of sound recordings presents a number of challenges, such as managing large collections of recordings with associated metadata, developing flexible annotation tools that can accommodate the diverse range of vocalization profiles of different organisms, and addressing the scarcity of expert annotators. 2. We present Whombat a user-friendly, browser-based interface for managing audio recordings and annotation projects, with several visualization, exploration, and annotation tools. It enables users to quickly annotate, review, and share annotations, as well as visualize and evaluate a set of machine learning predictions on a dataset. The tool facilitates an iterative workflow where user annotations and machine learning predictions feedback to enhance model performance and annotation quality. 3. We demonstrate the flexibility of Whombat by showcasing two distinct use cases: an project aimed at enhancing automated UK bat call identification at the Bat Conservation Trust (BCT), and a collaborative effort among the USDA Forest Service and Oregon State University researchers exploring bioacoustic applications and extending automated avian classification models in the Pacific Northwest, USA. 4. Whombat is a flexible tool that can effectively address the challenges of annotation for bioacoustic research. It can be used for individual and collaborative work, hosted on a shared server or accessed remotely, or run on a personal computer without the need for coding skills.

* 17 pages, 2 figures, 2 tables, to be submitted to Methods in Ecology and Evolution

Via

Access Paper or Ask Questions