Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Christian Igel

Predicting urban tree cover from incomplete point labels and limited background information

Nov 20, 2023

Hui Zhang, Ankit Kariryaa, Venkanna Babu Guthula, Christian Igel, Stefan Oehmcke

Figure 1 for Predicting urban tree cover from incomplete point labels and limited background information

Figure 2 for Predicting urban tree cover from incomplete point labels and limited background information

Figure 3 for Predicting urban tree cover from incomplete point labels and limited background information

Figure 4 for Predicting urban tree cover from incomplete point labels and limited background information

Abstract:Trees inside cities are important for the urban microclimate, contributing positively to the physical and mental health of the urban dwellers. Despite their importance, often only limited information about city trees is available. Therefore in this paper, we propose a method for mapping urban trees in high-resolution aerial imagery using limited datasets and deep learning. Deep learning has become best-practice for this task, however, existing approaches rely on large and accurately labelled training datasets, which can be difficult and expensive to obtain. However, often noisy and incomplete data may be available that can be combined and utilized to solve more difficult tasks than those datasets were intended for. This paper studies how to combine accurate point labels of urban trees along streets with crowd-sourced annotations from an open geographic database to delineate city trees in remote sensing images, a task which is challenging even for humans. To that end, we perform semantic segmentation of very high resolution aerial imagery using a fully convolutional neural network. The main challenge is that our segmentation maps are sparsely annotated and incomplete. Small areas around the point labels of the street trees coming from official and crowd-sourced data are marked as foreground class. Crowd-sourced annotations of streets, buildings, etc. define the background class. Since the tree data is incomplete, we introduce a masking to avoid class confusion. Our experiments in Hamburg, Germany, showed that the system is able to produce tree cover maps, not limited to trees along streets, without providing tree delineations. We evaluated the method on manually labelled trees and show that performance drastically deteriorates if the open geographic database is not used.

Via

Access Paper or Ask Questions

Benchmarking Individual Tree Mapping with Sub-meter Imagery

Nov 14, 2023

Dimitri Gominski, Ankit Kariryaa, Martin Brandt, Christian Igel, Sizhuo Li, Maurice Mugabowindekwe, Rasmus Fensholt

Abstract:There is a rising interest in mapping trees using satellite or aerial imagery, but there is no standardized evaluation protocol for comparing and enhancing methods. In dense canopy areas, the high variability of tree sizes and their spatial proximity makes it arduous to define the quality of the predictions. Concurrently, object-centric approaches such as bounding box detection usuallyperform poorly on small and dense objects. It thus remains unclear what is the ideal framework for individual tree mapping, in regards to detection and segmentation approaches, convolutional neural networks and transformers. In this paper, we introduce an evaluation framework suited for individual tree mapping in any physical environment, with annotation costs and applicative goals in mind. We review and compare different approaches and deep architectures, and introduce a new method that we experimentally prove to be a good compromise between segmentation and detection.

Via

Access Paper or Ask Questions

Familiarity-Based Open-Set Recognition Under Adversarial Attacks

Nov 08, 2023

Philip Enevoldsen, Christian Gundersen, Nico Lang, Serge Belongie, Christian Igel

Abstract:Open-set recognition (OSR), the identification of novel categories, can be a critical component when deploying classification models in real-world applications. Recent work has shown that familiarity-based scoring rules such as the Maximum Softmax Probability (MSP) or the Maximum Logit Score (MLS) are strong baselines when the closed-set accuracy is high. However, one of the potential weaknesses of familiarity-based OSR are adversarial attacks. Here, we present gradient-based adversarial attacks on familiarity scores for both types of attacks, False Familiarity and False Novelty attacks, and evaluate their effectiveness in informed and uninformed settings on TinyImageNet.

* Published in: The 2nd Workshop and Challenges for Out-of-Distribution Generalization in Computer Vision, ICCV 2023

Via

Access Paper or Ask Questions

Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI

Sep 05, 2023

Dustin Wright, Christian Igel, Gabrielle Samuel, Raghavendra Selvan

Figure 1 for Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI

Figure 2 for Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI

Figure 3 for Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI

Figure 4 for Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI

Abstract:Artificial Intelligence (AI) is currently spearheaded by machine learning (ML) methods such as deep learning (DL) which have accelerated progress on many tasks thought to be out of reach of AI. These ML methods can often be compute hungry, energy intensive, and result in significant carbon emissions, a known driver of anthropogenic climate change. Additionally, the platforms on which ML systems run are associated with environmental impacts including and beyond carbon emissions. The solution lionized by both industry and the ML community to improve the environmental sustainability of ML is to increase the efficiency with which ML systems operate in terms of both compute and energy consumption. In this perspective, we argue that efficiency alone is not enough to make ML as a technology environmentally sustainable. We do so by presenting three high level discrepancies between the effect of efficiency on the environmental sustainability of ML when considering the many variables which it interacts with. In doing so, we comprehensively demonstrate, at multiple levels of granularity both technical and non-technical reasons, why efficiency is not enough to fully remedy the environmental impacts of ML. Based on this, we present and argue for systems thinking as a viable path towards improving the environmental sustainability of ML holistically.

* 24 pages; 6 figures

Via

Access Paper or Ask Questions

Smooth Monotonic Networks

Jun 01, 2023

Christian Igel

Abstract:Monotonicity constraints are powerful regularizers in statistical modelling. They can support fairness in computer supported decision making and increase plausibility in data-driven scientific models. The seminal min-max (MM) neural network architecture ensures monotonicity, but often gets stuck in undesired local optima during training because of vanishing gradients. We propose a simple modification of the MM network using strictly-increasing smooth non-linearities that alleviates this problem. The resulting smooth min-max (SMM) network module inherits the asymptotic approximation properties from the MM architecture. It can be used within larger deep learning systems trained end-to-end. The SMM module is considerably simpler and less computationally demanding than state-of-the-art neural networks for monotonic modelling. Still, in our experiments, it compared favorably to alternative neural and non-neural approaches in terms of generalization performance.

Via

Access Paper or Ask Questions

BuildSeg: A General Framework for the Segmentation of Buildings

Jan 15, 2023

Lei Li, Tianfang Zhang, Stefan Oehmcke, Fabian Gieseke, Christian Igel

Figure 1 for BuildSeg: A General Framework for the Segmentation of Buildings

Figure 2 for BuildSeg: A General Framework for the Segmentation of Buildings

Abstract:Building segmentation from aerial images and 3D laser scanning (LiDAR) is a challenging task due to the diversity of backgrounds, building textures, and image quality. While current research using different types of convolutional and transformer networks has considerably improved the performance on this task, even more accurate segmentation methods for buildings are desirable for applications such as automatic mapping. In this study, we propose a general framework termed \emph{BuildSeg} employing a generic approach that can be quickly applied to segment buildings. Different data sources were combined to increase generalization performance. The approach yields good results for different data sources as shown by experiments on high-resolution multi-spectral and LiDAR imagery of cities in Norway, Denmark and France. We applied ConvNeXt and SegFormer based models on the high resolution aerial image dataset from the MapAI-competition. The methods achieved an IOU of 0.7902 and a boundary IOU of 0.6185. We used post-processing to account for the rectangular shape of the objects. This increased the boundary IOU from 0.6185 to 0.6189.

Via

Access Paper or Ask Questions

Mask-FPAN: Semi-Supervised Face Parsing in the Wild With De-Occlusion and UV GAN

Dec 18, 2022

Lei Li, Tianfang Zhang, Stefan Oehmcke, Fabian Gieseke, Christian Igel

Figure 1 for Mask-FPAN: Semi-Supervised Face Parsing in the Wild With De-Occlusion and UV GAN

Figure 2 for Mask-FPAN: Semi-Supervised Face Parsing in the Wild With De-Occlusion and UV GAN

Figure 3 for Mask-FPAN: Semi-Supervised Face Parsing in the Wild With De-Occlusion and UV GAN

Figure 4 for Mask-FPAN: Semi-Supervised Face Parsing in the Wild With De-Occlusion and UV GAN

Abstract:Fine-grained semantic segmentation of a person's face and head, including facial parts and head components, has progressed a great deal in recent years. However, it remains a challenging task, whereby considering ambiguous occlusions and large pose variations are particularly difficult. To overcome these difficulties, we propose a novel framework termed Mask-FPAN. It uses a de-occlusion module that learns to parse occluded faces in a semi-supervised way. In particular, face landmark localization, face occlusionstimations, and detected head poses are taken into account. A 3D morphable face model combined with the UV GAN improves the robustness of 2D face parsing. In addition, we introduce two new datasets named FaceOccMask-HQ and CelebAMaskOcc-HQ for face paring work. The proposed Mask-FPAN framework addresses the face parsing problem in the wild and shows significant performance improvements with MIOU from 0.7353 to 0.9013 compared to the state-of-the-art on challenging face datasets.

* 9 pages

Via

Access Paper or Ask Questions

LR-CSNet: Low-Rank Deep Unfolding Network for Image Compressive Sensing

Dec 18, 2022

Tianfang Zhang, Lei Li, Christian Igel, Stefan Oehmcke, Fabian Gieseke, Zhenming Peng

Figure 1 for LR-CSNet: Low-Rank Deep Unfolding Network for Image Compressive Sensing

Figure 2 for LR-CSNet: Low-Rank Deep Unfolding Network for Image Compressive Sensing

Figure 3 for LR-CSNet: Low-Rank Deep Unfolding Network for Image Compressive Sensing

Figure 4 for LR-CSNet: Low-Rank Deep Unfolding Network for Image Compressive Sensing

Abstract:Deep unfolding networks (DUNs) have proven to be a viable approach to compressive sensing (CS). In this work, we propose a DUN called low-rank CS network (LR-CSNet) for natural image CS. Real-world image patches are often well-represented by low-rank approximations. LR-CSNet exploits this property by adding a low-rank prior to the CS optimization task. We derive a corresponding iterative optimization procedure using variable splitting, which is then translated to a new DUN architecture. The architecture uses low-rank generation modules (LRGMs), which learn low-rank matrix factorizations, as well as gradient descent and proximal mappings (GDPMs), which are proposed to extract high-frequency features to refine image details. In addition, the deep features generated at each reconstruction stage in the DUN are transferred between stages to boost the performance. Our extensive experiments on three widely considered datasets demonstrate the promising performance of LR-CSNet compared to state-of-the-art methods in natural image CS.

Via

Access Paper or Ask Questions

Energy Consumption-Aware Tabular Benchmarks for Neural Architecture Search

Oct 12, 2022

Pedram Bakhtiarifard, Christian Igel, Raghavendra Selvan

Figure 1 for Energy Consumption-Aware Tabular Benchmarks for Neural Architecture Search

Figure 2 for Energy Consumption-Aware Tabular Benchmarks for Neural Architecture Search

Figure 3 for Energy Consumption-Aware Tabular Benchmarks for Neural Architecture Search

Figure 4 for Energy Consumption-Aware Tabular Benchmarks for Neural Architecture Search

Abstract:The demand for large-scale computational resources for Neural Architecture Search (NAS) has been lessened by tabular benchmarks for NAS. Evaluating NAS strategies is now possible on extensive search spaces and at a moderate computational cost. But so far, NAS has mainly focused on maximising performance on some hold-out validation/test set. However, energy consumption is a partially conflicting objective that should not be neglected. We hypothesise that constraining NAS to include the energy consumption of training the models could reveal a sub-space of undiscovered architectures that are more computationally efficient with a smaller carbon footprint. To support the hypothesis, an existing tabular benchmark for NAS is augmented with the energy consumption of each architecture. We then perform multi-objective optimisation that includes energy consumption as an additional objective. We demonstrate the usefulness of multi-objective NAS for uncovering the trade-off between performance and energy consumption as well as for finding more energy-efficient architectures. The updated tabular benchmark, EC-NAS-Bench, is open-sourced to encourage the further exploration of energy consumption-aware NAS.

* Source code at https://github.com/PedramBakh/EC-NAS-Bench

Via

Access Paper or Ask Questions

Self-Supervised Speech Representation Learning: A Review

May 21, 2022

Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe(+2 more)

Figure 1 for Self-Supervised Speech Representation Learning: A Review

Figure 2 for Self-Supervised Speech Representation Learning: A Review

Figure 3 for Self-Supervised Speech Representation Learning: A Review

Figure 4 for Self-Supervised Speech Representation Learning: A Review

Abstract:Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and languages for which only limited labeled data is available. Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains. Such methods have shown success in natural language processing and computer vision domains, achieving new levels of performance while reducing the number of labels required for many downstream scenarios. Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods. Other approaches rely on multi-modal data for pre-training, mixing text or visual data streams with speech. Although self-supervised speech representation is still a nascent research area, it is closely related to acoustic word embedding and learning with zero lexical resources, both of which have seen active research for many years. This review presents approaches for self-supervised speech representation learning and their connection to other research areas. Since many current methods focus solely on automatic speech recognition as a downstream task, we review recent efforts on benchmarking learned representations to extend the application beyond speech recognition.

Via

Access Paper or Ask Questions