Jean-François Lalonde

Beyond the Pixel: a Photometrically Calibrated HDR Dataset for Luminance and Color Temperature Prediction

Apr 24, 2023

Christophe Bolduc, Justine Giroux, Marc Hébert, Claude Demers, Jean-François Lalonde

Abstract: Light plays an important role in human well-being. However, most computer vision tasks treat pixels without considering their relationship to physical luminance. To address this shortcoming, we present the first large-scale photometrically calibrated dataset of high dynamic range 360° panoramas. Our key contribution is the calibration of an existing, uncalibrated HDR dataset. We do so by accurately capturing RAW bracketed exposures simultaneously with a professional photometric measurement device (chroma meter) for multiple scenes across a variety of lighting conditions. Using the resulting measurements, we establish the calibration coefficients to be applied to the HDR images. The resulting dataset is a rich representation of indoor scenes which displays a wide range of illuminance and color temperature, and varied types of light sources. We exploit the dataset to introduce three novel tasks, in which per-pixel luminance, per-pixel color temperature, and planar illuminance are predicted from a single input image. Finally, we also capture another smaller calibrated dataset with a commercial 360° camera, to experiment with generalization across cameras. We are optimistic that the release of our datasets and associated code will spark interest in physically accurate light estimation within the community.

Robust Unsupervised StyleGAN Image Restoration

Feb 13, 2023

Yohan Poirier-Ginter, Jean-François Lalonde

Abstract: GAN-based image restoration inverts the generative process to repair images corrupted by known degradations. Existing unsupervised methods must be carefully tuned for each task and degradation level. In this work, we make StyleGAN image restoration robust: a single set of hyperparameters works across a wide range of degradation levels. This makes it possible to handle combinations of several degradations, without the need to retune. Our proposed approach relies on a 3-phase progressive latent space extension and a conservative optimizer, which avoids the need for any additional regularization terms. Extensive experiments demonstrate robustness on inpainting, upsampling, denoising, and deartifacting at varying degradation levels, outperforming other StyleGAN-based inversion techniques. Our approach also compares favorably to diffusion-based restoration by yielding much more realistic inversion results. Code will be released upon publication.

* 8 pages, submitted at CVPR 2023

The Differentiable Lens: Compound Lens Search over Glass Surfaces and Materials for Object Detection

Dec 08, 2022

Geoffroi Côté, Fahim Mannan, Simon Thibault, Jean-François Lalonde, Felix Heide

Abstract: Most camera lens systems are designed in isolation, separately from downstream computer vision methods. Recently, joint optimization approaches that design lenses alongside other components of the image acquisition and processing pipeline -- notably, downstream neural networks -- have achieved improved imaging quality or better performance on vision tasks. However, these existing methods optimize only a subset of lens parameters and cannot optimize glass materials given their categorical nature. In this work, we develop a differentiable spherical lens simulation model that accurately captures geometrical aberrations. We propose an optimization strategy to address the challenges of lens design -- notorious for non-convex loss function landscapes and many manufacturing constraints -- that are exacerbated in joint optimization tasks. Specifically, we introduce quantized continuous glass variables to facilitate the optimization and selection of glass materials in an end-to-end design context, and couple this with carefully designed constraints to support manufacturability. In automotive object detection, we show improved detection performance over existing designs even when simplifying designs to two- or three-element lenses, despite significantly degrading the image quality. Code and optical designs will be made publicly available.

* 15 pages, 11 figures

Editable Indoor Lighting Estimation

Nov 09, 2022

Henrique Weber, Mathieu Garon, Jean-François Lalonde

Abstract: We present a method for estimating lighting from a single perspective image of an indoor scene. Previous methods for predicting indoor illumination usually focus on either simple, parametric lighting that lacks realism, or on richer representations that are difficult or even impossible to understand or modify after prediction. We propose a pipeline that estimates a parametric light that is easy to edit and allows renderings with strong shadows, alongside a non-parametric texture with the high-frequency information necessary for realistic rendering of specular objects. Once estimated, the predictions obtained with our model are interpretable and can easily be modified by an artist/user with a few mouse clicks. Quantitative and qualitative results show that our approach makes indoor lighting estimation easier to handle by a casual user, while still producing competitive results.

* ECCV 2022

A Deep Perceptual Measure for Lens and Camera Calibration

Aug 25, 2022

Yannick Hold-Geoffroy, Dominique Piché-Meunier, Kalyan Sunkavalli, Jean-Charles Bazin, François Rameau, Jean-François Lalonde

Abstract: Image editing and compositing have become ubiquitous in entertainment, from digital art to AR and VR experiences. To produce beautiful composites, the camera needs to be geometrically calibrated, which can be tedious and requires a physical calibration target. In place of the traditional multi-image calibration process, we propose to infer the camera calibration parameters such as pitch, roll, field of view, and lens distortion directly from a single image using a deep convolutional neural network. We train this network using automatically generated samples from a large-scale panorama dataset, yielding competitive accuracy in terms of standard l2 error. However, we argue that minimizing such standard error metrics might not be optimal for many applications. In this work, we investigate human sensitivity to inaccuracies in geometric camera calibration. To this end, we conduct a large-scale human perception study where we ask participants to judge the realism of 3D objects composited with correct and biased camera calibration parameters. Based on this study, we develop a new perceptual measure for camera calibration and demonstrate that our deep calibration network outperforms previous single-image based calibration methods on both standard metrics and this novel perceptual measure. Finally, we demonstrate the use of our calibration network for several applications, including virtual object insertion, image retrieval, and compositing. A demonstration of our approach is available at https://lvsn.github.io/deepcalib.

* 13 pages, 12 figures, project page (including live demo) available at https://lvsn.github.io/deepcalib. arXiv admin note: text overlap with arXiv:1712.01259

Casual Indoor HDR Radiance Capture from Omnidirectional Images

Aug 16, 2022

Pulkit Gera, Mohammad Reza Karimi Dastjerdi, Charles Renaud, P. J. Narayanan, Jean-François Lalonde

Abstract: We present PanoHDR-NeRF, a novel pipeline to casually capture a plausible full HDR radiance field of a large indoor scene without elaborate setups or complex capture protocols. First, a user captures a low dynamic range (LDR) omnidirectional video of the scene by freely waving an off-the-shelf camera around the scene. Then, an LDR2HDR network uplifts the captured LDR frames to HDR, subsequently used to train a tailored NeRF++ model. The resulting PanoHDR-NeRF pipeline can estimate full HDR panoramas from any location of the scene. Through experiments on a novel test dataset of a variety of real scenes with the ground truth HDR radiance captured at locations not seen during training, we show that PanoHDR-NeRF predicts plausible radiance from any scene point. We also show that the HDR images produced by PanoHDR-NeRF can synthesize correct lighting effects, enabling the augmentation of indoor scenes with synthetic objects that are lit correctly.

Robust Scene Inference under Noise-Blur Dual Corruptions

Jul 24, 2022

Bhavya Goyal, Jean-François Lalonde, Yin Li, Mohit Gupta

Abstract: Scene inference under low light is a challenging problem due to severe noise in the captured images. One way to reduce noise is to use longer exposure during the capture. However, in the presence of motion (scene or camera motion), longer exposures lead to motion blur, resulting in loss of image information. This creates a trade-off between these two kinds of image degradations: motion blur (due to long exposure) vs. noise (due to short exposure), also referred to as a dual image corruption pair in this paper. With the rise of cameras capable of capturing multiple exposures of the same scene simultaneously, it is possible to overcome this trade-off. Our key observation is that although the amount and nature of degradation vary across these different image captures, the semantic content remains the same across all images. To this end, we propose a method to leverage these multi-exposure captures for robust inference under low light and motion. Our method builds on a feature consistency loss to encourage similar results from these individual captures, and uses the ensemble of their final predictions for robust visual recognition. We demonstrate the effectiveness of our approach on simulated images as well as real captures with multiple exposures, and across the tasks of object detection and image classification.

* ICCP 2022 Camera Ready

Overparameterization Improves StyleGAN Inversion

May 12, 2022

Yohan Poirier-Ginter, Alexandre Lessard, Ryan Smith, Jean-François Lalonde

Abstract: Deep generative models like StyleGAN hold the promise of semantic image editing: modifying images by their content, rather than their pixel values. Unfortunately, working with arbitrary images requires inverting the StyleGAN generator, which has remained challenging so far. Existing inversion approaches obtain promising yet imperfect results, having to trade off reconstruction quality against downstream editability. To improve quality, these approaches must resort to various techniques that extend the model latent space after training. Taking a step back, we observe that these methods essentially all propose, in one way or another, to increase the number of free parameters. This suggests that inversion might be difficult because it is underconstrained. In this work, we address this directly and dramatically overparameterize the latent space, before training, with simple changes to the original StyleGAN architecture. Our overparameterization increases the available degrees of freedom, which in turn facilitates inversion. We show that this allows us to obtain near-perfect image reconstruction without the need for encoders or for altering the latent space after training. Our approach also retains editability, which we demonstrate by realistically interpolating between images.

* 6 pages, accepted for publication at AI for Content Creation Workshop (CVPR 2022)

Guided Co-Modulated GAN for 360° Field of View Extrapolation

Apr 15, 2022

Mohammad Reza Karimi Dastjerdi, Yannick Hold-Geoffroy, Jonathan Eisenmann, Siavash Khodadadeh, Jean-François Lalonde

Abstract: We propose a method to extrapolate a 360° field of view from a single image that allows for user-controlled synthesis of the out-painted content. To do so, we propose improvements to an existing GAN-based in-painting architecture for out-painting panoramic image representation. Our method obtains state-of-the-art results and outperforms previous methods on standard image quality metrics. To allow controlled synthesis of out-painting, we introduce a novel guided co-modulation framework, which drives the image generation process with a common pretrained discriminative model. Doing so maintains the high visual quality of generated panoramas while enabling user-controlled semantic content in the extrapolated field of view. We demonstrate the state-of-the-art results of our method on field of view extrapolation both qualitatively and quantitatively, providing thorough analysis of our novel editing capabilities. Finally, we demonstrate that our approach benefits the photorealistic virtual insertion of highly glossy objects in photographs.

* 18 pages, 9 figures

Matching Feature Sets for Few-Shot Image Classification

Apr 02, 2022

Arman Afrasiyabi, Hugo Larochelle, Jean-François Lalonde, Christian Gagné

Abstract: In image classification, it is common practice to train deep networks to extract a single feature vector per input image. Few-shot classification methods also mostly follow this trend. In this work, we depart from this established direction and instead propose to extract sets of feature vectors for each image. We argue that a set-based representation intrinsically builds a richer representation of images from the base classes, which can subsequently better transfer to the few-shot classes. To do so, we propose to adapt existing feature extractors to instead produce sets of feature vectors from images. Our approach, dubbed SetFeat, embeds shallow self-attention mechanisms inside existing encoder architectures. The attention modules are lightweight, and as such our method results in encoders that have approximately the same number of parameters as their original versions. During training and inference, a set-to-set matching metric is used to perform image classification. The effectiveness of our proposed architecture and metrics is demonstrated via thorough experiments on standard few-shot datasets -- namely miniImageNet, tieredImageNet, and CUB -- in both the 1- and 5-shot scenarios. In all cases but one, our method outperforms the state of the art.

* International Conference on Computer Vision and Pattern Recognition (CVPR), 2022
