Mathieu Salzmann

CVLab EPFL Switzerland

SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation

Apr 30, 2021

Robin Chan, Krzysztof Lis, Svenja Uhlemeyer, Hermann Blum, Sina Honari, Roland Siegwart, Mathieu Salzmann, Pascal Fua, Matthias Rottmann

Abstract:State-of-the-art semantic or instance segmentation deep neural networks (DNNs) are usually trained on a closed set of semantic classes. As such, they are ill-equipped to handle previously-unseen objects. However, detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving, especially if they appear on the road ahead. While some methods have tackled the tasks of anomalous or out-of-distribution object segmentation, progress remains slow, in large part due to the lack of solid benchmarks; existing datasets either consist of synthetic data, or suffer from label inconsistencies. In this paper, we bridge this gap by introducing the "SegmentMeIfYouCan" benchmark. Our benchmark addresses two tasks: anomalous object segmentation, which considers any previously-unseen object category; and road obstacle segmentation, which focuses on any object on the road, be it known or unknown. We provide two corresponding datasets together with a test suite performing an in-depth method analysis, considering both established pixel-wise performance metrics and recent component-wise ones, which are insensitive to object sizes. We empirically evaluate multiple state-of-the-art baseline methods, including several specifically designed for anomaly / obstacle segmentation, on our datasets as well as on public ones, using our benchmark suite. The anomaly and obstacle segmentation results show that our datasets contribute to the diversity and difficulty of both dataset landscapes.

* 10 pages, 13 figures, website http://www.segmentmeifyoucan.com/

Temporally-Coherent Surface Reconstruction via Metric-Consistent Atlases

Apr 14, 2021

Jan Bednarik, Vladimir G. Kim, Siddhartha Chaudhuri, Shaifali Parashar, Mathieu Salzmann, Pascal Fua, Noam Aigerman

Abstract:We propose a method for the unsupervised reconstruction of a temporally-coherent sequence of surfaces from a sequence of time-evolving point clouds, yielding dense, semantically meaningful correspondences between all keyframes. We represent the reconstructed surface as an atlas, using a neural network. Using canonical correspondences defined via the atlas, we encourage the reconstruction to be as isometric as possible across frames, leading to semantically-meaningful reconstruction. Through experiments and comparisons, we empirically show that our method achieves results that exceed the state of the art in the accuracy of unsupervised correspondences and accuracy of surface reconstruction.

* 16 pages

Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search

Apr 12, 2021

Kaicheng Yu, Rene Ranftl, Mathieu Salzmann

Abstract:Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware. However, recent works have empirically shown a ranking disorder between the performance of stand-alone architectures and that of the corresponding shared-weight networks. This violates the main assumption of weight-sharing NAS algorithms, thus limiting their effectiveness. We tackle this issue by proposing a regularization term that aims to maximize the correlation between the performance rankings of the shared-weight network and that of the standalone architectures using a small set of landmark architectures. We incorporate our regularization term into three different NAS algorithms and show that it consistently improves performance across algorithms, search-spaces, and tasks.

* Accepted to CVPR 2021
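
The ranking disorder mentioned in the abstract can be measured with a rank correlation. The following is a minimal sketch, assuming Kendall's tau as the correlation measure and using made-up accuracy values for four landmark architectures. It is not the paper's regularization term; the function name `kendall_tau` and all numbers are hypothetical and only illustrate how the two rankings can be compared.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall rank correlation (tau-a) between two equal-length score lists."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        # A pair is concordant when both lists order items i and j the same way.
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical accuracies of four landmark architectures (illustrative only).
standalone = [0.91, 0.93, 0.89, 0.95]  # each trained on its own
supernet = [0.60, 0.58, 0.55, 0.66]    # evaluated with shared weights
tau = kendall_tau(standalone, supernet)
print(f"Kendall tau between the two rankings: {tau:.3f}")
```

A value of 1 means the shared-weight network ranks the landmarks exactly as stand-alone training does; lower values indicate the kind of disorder the abstract describes.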

Modeling Object Dissimilarity for Deep Saliency Prediction

Apr 08, 2021

Bahar Aydemir, Deblina Bhattacharjee, Seungryong Kim, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk

Abstract:Saliency prediction has made great strides over the past two decades, with current techniques modeling low-level information, such as color, intensity and size contrasts, and high-level information, such as attention and gaze direction for entire objects. Despite this, these methods fail to account for the dissimilarity between objects, which humans naturally do. In this paper, we introduce a detection-guided saliency prediction network that explicitly models the differences between multiple objects, such as their appearance and size dissimilarities. Our approach is general, allowing us to fuse our object dissimilarities with features extracted by any deep saliency prediction network. As evidenced by our experiments, this consistently boosts the accuracy of the baseline networks, enabling us to outperform the state-of-the-art models on three saliency benchmarks, namely SALICON, MIT300 and CAT2000.

Robust Differentiable SVD

Apr 08, 2021

Wei Wang, Zheng Dang, Yinlin Hu, Pascal Fua, Mathieu Salzmann

Abstract:Eigendecomposition of symmetric matrices is at the heart of many computer vision algorithms. However, the derivatives of the eigenvectors tend to be numerically unstable, whether using the SVD to compute them analytically or using the Power Iteration (PI) method to approximate them. This instability arises in the presence of eigenvalues that are close to each other. This makes integrating eigendecomposition into deep networks difficult and often results in poor convergence, particularly when dealing with large matrices. While this can be mitigated by partitioning the data into small arbitrary groups, doing so has no theoretical basis and makes it impossible to exploit the full power of eigendecomposition. In previous work, we mitigated this using SVD during the forward pass and PI to compute the gradients during the backward pass. However, the iterative deflation procedure required to compute multiple eigenvectors using PI tends to accumulate errors and yield inaccurate gradients. Here, we show that the Taylor expansion of the SVD gradient is theoretically equivalent to the gradient obtained using PI without relying in practice on an iterative process and thus yields more accurate gradients. We demonstrate the benefits of this increased accuracy for image classification and style transfer.

* IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2021
* IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) PREPRINT 2021
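
To illustrate why close eigenvalues are a problem and how a truncated expansion helps, here is a minimal sketch. It assumes that the instability comes from factors of the form 1/(λ_i − λ_j) in the analytic eigenvector gradient, and that the expansion is a truncated geometric series in λ_j/λ_i. The function names and the expansion order are illustrative and are not taken from the paper.

```python
def exact_coeff(lam_i, lam_j):
    """Exact factor 1 / (lam_i - lam_j); blows up as lam_j approaches lam_i."""
    return 1.0 / (lam_i - lam_j)


def taylor_coeff(lam_i, lam_j, order=9):
    """Truncated series for 1 / (lam_i - lam_j), assuming lam_i >= lam_j >= 0.

    Uses 1/(lam_i - lam_j) = (1/lam_i) * sum_k (lam_j/lam_i)^k and keeps the
    terms k = 0..order, so the result is bounded by (order + 1) / lam_i.
    """
    ratio = lam_j / lam_i
    return sum(ratio ** k for k in range(order + 1)) / lam_i


# Well-separated eigenvalues: the truncated series is close to the exact value.
well_separated = (taylor_coeff(2.0, 1.0), exact_coeff(2.0, 1.0))

# Nearly equal eigenvalues: the exact factor explodes, the series stays bounded.
near_equal = (taylor_coeff(1.0, 1.0 - 1e-9), exact_coeff(1.0, 1.0 - 1e-9))

print("well separated:", well_separated)
print("nearly equal:  ", near_equal)
```

In this sketch the truncation trades a small bias for well-separated eigenvalues against a bounded factor when they nearly coincide.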

Wide-Depth-Range 6D Object Pose Estimation in Space

Apr 01, 2021

Yinlin Hu, Sebastien Speierer, Wenzel Jakob, Pascal Fua, Mathieu Salzmann

Abstract:6D pose estimation in space poses unique challenges that are not commonly encountered in the terrestrial setting. One of the most striking differences is the lack of atmospheric scattering, allowing objects to be visible from a great distance while complicating illumination conditions. Currently available benchmark datasets do not place a sufficient emphasis on this aspect and mostly depict the target in close proximity. Prior work tackling pose estimation under large scale variations relies on a two-stage approach to first estimate scale, followed by pose estimation on a resized image patch. We instead propose a single-stage hierarchical end-to-end trainable network that is more robust to scale variations. We demonstrate that it outperforms existing approaches not only on images synthesized to resemble images taken in space but also on standard benchmarks.

* CVPR 2021

Temporally-Transferable Perturbations: Efficient, One-Shot Adversarial Attacks for Online Visual Object Trackers

Dec 30, 2020

Krishna Kanth Nakka, Mathieu Salzmann

Abstract:In recent years, trackers based on Siamese networks have emerged as highly effective and efficient for visual object tracking (VOT). While these methods were shown to be vulnerable to adversarial attacks, like most deep networks for visual recognition tasks, the existing attacks for VOT trackers all require perturbing the search region of every input frame to be effective, which comes at a non-negligible cost, considering that VOT is a real-time task. In this paper, we propose a framework to generate a single temporally transferable adversarial perturbation from the object template image only. This perturbation can then be added to every search image, which comes at virtually no cost, and still successfully fools the tracker. Our experiments show that our approach outperforms the state-of-the-art attacks on the standard VOT benchmarks in the untargeted scenario. Furthermore, we show that our formalism naturally extends to targeted attacks that force the tracker to follow any given trajectory by precomputing diverse directional perturbations.

Detecting Road Obstacles by Erasing Them

Dec 25, 2020

Krzysztof Lis, Sina Honari, Pascal Fua, Mathieu Salzmann

Abstract:Vehicles can encounter a myriad of obstacles on the road, and it is not feasible to record them all beforehand to train a detector. Our method selects image patches and inpaints them with the surrounding road texture, which tends to remove obstacles from those patches. It then uses a network trained to recognize discrepancies between the original patch and the inpainted one, which signals an erased obstacle. We also contribute a new dataset for monocular road obstacle detection, and show that our approach outperforms the state-of-the-art methods on both our new dataset and the standard Fishyscapes Lost & Found benchmark.
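
The erase-then-compare idea can be sketched on a toy example. This is a heavily simplified illustration under stated assumptions: the "image" is a single row of intensities, inpainting is replaced by filling the patch with the mean of the surrounding pixels, and the trained discrepancy network is replaced by a plain absolute difference with a hypothetical threshold of 0.2. None of these choices come from the paper.

```python
def inpaint_patch(row, start, end):
    """Replace row[start:end] with the mean of the pixels outside the patch."""
    context = row[:start] + row[end:]
    if not context:
        raise ValueError("patch covers the whole row; no context to inpaint from")
    fill = sum(context) / len(context)
    return row[:start] + [fill] * (end - start) + row[end:]


def discrepancy(original, inpainted):
    """Per-pixel absolute difference (a stand-in for a learned discrepancy)."""
    return [abs(o - p) for o, p in zip(original, inpainted)]


# Toy road row: uniform texture around 0.5, with a bright object at indices 3-4.
road = [0.50, 0.52, 0.49, 0.95, 0.97, 0.51, 0.50]
erased = inpaint_patch(road, 3, 5)
scores = discrepancy(road, erased)

# Pixels whose appearance changed a lot when erased are flagged as obstacles.
flagged = [i for i, s in enumerate(scores) if s > 0.2]
print("flagged pixel indices:", flagged)
```

Pixels that already look like road barely change when inpainted, so only the erased object produces a large discrepancy.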

Unsupervised Domain Adaptation with Temporal-Consistent Self-Training for 3D Hand-Object Joint Reconstruction

Dec 21, 2020

Mengshi Qi, Edoardo Remelli, Mathieu Salzmann, Pascal Fua

Abstract:Deep learning solutions for hand-object 3D pose and shape estimation are now very effective when an annotated dataset is available to train them to handle the scenarios and lighting conditions they will encounter at test time. Unfortunately, this is not always the case, and one often has to resort to training them on synthetic data, which does not guarantee that they will work well in real situations. In this paper, we introduce an effective approach to addressing this challenge by exploiting 3D geometric constraints within a cycle generative adversarial network (CycleGAN) to perform domain adaptation. Furthermore, in contrast to most existing works, which fail to leverage the rich temporal information available in unlabeled real videos as a source of supervision, we propose to enforce short- and long-term temporal consistency to fine-tune the domain-adapted model in a self-supervised fashion. We will demonstrate that our approach outperforms state-of-the-art 3D hand-object joint reconstruction methods on three widely-used benchmarks and will make our code publicly available.

* In submission

Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

Dec 10, 2020

Fatemeh Saleh, Sadegh Aliakbarian, Hamid Rezatofighi, Mathieu Salzmann, Stephen Gould

Abstract:Despite the recent advances in multiple object tracking (MOT), achieved by joint detection and tracking, dealing with long occlusions remains a challenge. This is due to the fact that such techniques tend to ignore the long-term motion information. In this paper, we introduce a probabilistic autoregressive motion model to score tracklet proposals by directly measuring their likelihood. This is achieved by training our model to learn the underlying distribution of natural tracklets. As such, our model allows us not only to assign new detections to existing tracklets, but also to inpaint a tracklet when an object has been lost for a long time, e.g., due to occlusion, by sampling tracklets so as to fill the gap caused by misdetections. Our experiments demonstrate the superiority of our approach at tracking objects in challenging sequences; it outperforms the state of the art in most standard MOT metrics on multiple MOT benchmark datasets, including MOT16, MOT17, and MOT20.
