Julien Mairal

LJK

On Regularization and Robustness of Deep Neural Networks

Sep 30, 2018

Alberto Bietti, Grégoire Mialon, Julien Mairal

Abstract: Despite their success, deep neural networks suffer from several drawbacks: they lack robustness to small changes of input data known as "adversarial examples" and training them with small amounts of annotated data is challenging. In this work, we study the connection between regularization and robustness by viewing neural networks as elements of a reproducing kernel Hilbert space (RKHS) of functions and by regularizing them using the RKHS norm. Even though this norm cannot be computed, we consider various approximations based on upper and lower bounds. These approximations lead to new strategies for regularization, but also to existing ones such as spectral norm penalties or constraints, gradient penalties, or adversarial training. Besides, the kernel framework allows us to obtain margin-based bounds on adversarial generalization. We study the obtained algorithms for learning on small datasets, learning adversarially robust models, and discuss implications for learning implicit generative models.
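
As a rough illustration of one of the existing strategies the abstract names, a spectral norm penalty, the sketch below estimates the largest singular value of each weight matrix by power iteration and sums the squared estimates. The function names, the iteration count, and the squared-sum form of the penalty are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

def spectral_norm(W, n_iter=100, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = W @ v                # move to the left singular direction
        u /= np.linalg.norm(u)
        v = W.T @ u              # move back to the right singular direction
        v /= np.linalg.norm(v)
    return float(u @ W @ v)

def spectral_penalty(weights, lam=1e-2):
    """Hypothetical penalty: lam times the sum of squared spectral norms."""
    return lam * sum(spectral_norm(W) ** 2 for W in weights)
```

In a training loop, such a term would typically be added to the data-fitting loss, with `lam` controlling the strength of the regularization.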

On the Importance of Visual Context for Data Augmentation in Scene Understanding

Sep 06, 2018

Nikita Dvornik, Julien Mairal, Cordelia Schmid

Abstract: Performing data augmentation for learning deep neural networks is known to be important for training visual recognition systems. By artificially increasing the number of training examples, it helps reduce overfitting and improves generalization. While simple image transformations such as changing color intensity or adding random noise can already improve predictive performance in most vision tasks, larger gains can be obtained by leveraging task-specific prior knowledge. In this work, we consider object detection and semantic segmentation and augment the training images by blending objects in existing scenes, using instance segmentation annotations. We observe that randomly pasting objects on images hurts performance, unless the object is placed in the right context. To resolve this issue, we propose an explicit context model using a convolutional neural network, which predicts whether an image region is suitable for placing a given object or not. In our experiments, we show that by using copy-paste data augmentation with context guidance we are able to improve detection and segmentation on the PASCAL VOC12 and COCO datasets, with significant gains when few labeled examples are available. We also show that the method is not limited to datasets that come with expensive pixel-wise instance annotations and can be used when only bounding box annotations are available, by employing weakly-supervised learning for instance mask approximation.

* arXiv admin note: substantial text overlap with arXiv:1807.07428
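
The blending step described in the abstract above can be sketched as a mask-weighted mix of object and scene pixels. This is a minimal sketch, assuming a soft mask with values in [0, 1] and an object crop that fits inside the scene; the function name and arguments are hypothetical. The context model that decides where to paste, which is the paper's actual contribution, is not shown: the location is simply passed in.

```python
import numpy as np

def paste_object(scene, obj, mask, top, left):
    """Blend obj into scene at (top, left) using a soft mask in [0, 1].

    scene: (H, W, C) array, obj: (h, w, C) array, mask: (h, w) array.
    Assumes the object crop lies fully inside the scene.
    """
    out = scene.astype(np.float64)           # astype returns a copy
    h, w = mask.shape
    region = out[top:top + h, left:left + w]
    alpha = mask[..., None]                  # broadcast over color channels
    out[top:top + h, left:left + w] = alpha * obj + (1.0 - alpha) * region
    return out
```

A mask value of 1 copies the object pixel, 0 keeps the scene pixel, and intermediate values blend the two, which softens object boundaries.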

An Inexact Variable Metric Proximal Point Algorithm for Generic Quasi-Newton Acceleration

Jul 20, 2018

Hongzhou Lin, Julien Mairal, Zaid Harchaoui

Abstract: We propose an inexact variable-metric proximal point algorithm to accelerate gradient-based optimization algorithms. The proposed scheme, called QNing, can notably be applied to incremental first-order methods such as the stochastic variance-reduced gradient descent algorithm (SVRG) and other randomized incremental optimization algorithms. QNing is also compatible with composite objectives, meaning that it can provide exactly sparse solutions when the objective involves a sparsity-inducing regularization. When combined with limited-memory BFGS rules, QNing is particularly effective at solving high-dimensional optimization problems, while enjoying a worst-case linear convergence rate for strongly convex problems. We present experimental results where QNing gives significant improvements over competing methods for training machine learning methods on large samples and in high dimensions.

Modeling Visual Context is Key to Augmenting Object Detection Datasets

Jul 19, 2018

Nikita Dvornik, Julien Mairal, Cordelia Schmid

Abstract: Performing data augmentation for learning deep neural networks is well known to be important for training visual recognition systems. By artificially increasing the number of training examples, it helps reduce overfitting and improves generalization. For object detection, classical approaches for data augmentation consist of generating images obtained by basic geometrical transformations and color changes of original training images. In this work, we go one step further and leverage segmentation annotations to increase the number of object instances present in the training data. For this approach to be successful, we show that appropriately modeling the visual context surrounding objects is crucial to place them in the right environment. Otherwise, we show that the previous strategy actually hurts. With our context model, we achieve significant mean average precision improvements when few labeled examples are available on the VOC'12 benchmark.

* ECCV 2018, Sep 2018, Munich, Germany

Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice

Jun 19, 2018

Hongzhou Lin, Julien Mairal, Zaid Harchaoui

Abstract: We introduce a generic scheme for accelerating gradient-based optimization methods in the sense of Nesterov. The approach, called Catalyst, builds upon the inexact accelerated proximal point algorithm for minimizing a convex objective function, and consists of approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. One of the keys to achieving acceleration in theory and in practice is to solve these sub-problems with appropriate accuracy by using the right stopping criterion and the right warm-start strategy. We give practical guidelines to use Catalyst and present a comprehensive analysis of its global complexity. We show that Catalyst applies to a large class of algorithms, including gradient descent, block coordinate descent, incremental algorithms such as SAG, SAGA, SDCA, SVRG, MISO/Finito, and their proximal variants. For all of these methods, we establish faster rates using the Catalyst acceleration, for strongly convex and non-strongly convex objectives. We conclude with extensive experiments showing that acceleration is useful in practice, especially for ill-conditioned problems.

* Journal of Machine Learning Research (JMLR), 18(212):1-54, 2018
* link to publisher website: http://jmlr.org/papers/volume18/17-748/17-748.pdf
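
The outer loop described in the abstract above (approximately solving a sequence of auxiliary problems, with warm starts and extrapolation) can be sketched on a strongly convex quadratic. This is a sketch under stated assumptions rather than the authors' algorithm as published: plain gradient descent is the inner solver, a fixed number of inner steps stands in for the paper's stopping criterion, and the extrapolation weight is held constant, as is common in the strongly convex setting. The function name and all parameter values are hypothetical.

```python
import numpy as np

def catalyst_quadratic(A, b, kappa, n_outer=200, n_inner=20):
    """Catalyst-style outer loop around gradient descent for
    f(x) = 0.5 * x^T A x - b^T x, with A symmetric positive definite."""
    eigs = np.linalg.eigvalsh(A)
    mu, L = eigs[0], eigs[-1]            # strong convexity and smoothness constants
    q = mu / (mu + kappa)
    beta = (1.0 - np.sqrt(q)) / (1.0 + np.sqrt(q))  # constant extrapolation weight (assumption)
    step = 1.0 / (L + kappa)             # step size for the auxiliary problem
    x = np.zeros_like(b)
    y = x.copy()
    for _ in range(n_outer):
        x_prev = x
        z = x.copy()                     # warm start at the previous iterate
        for _ in range(n_inner):
            # gradient of the auxiliary objective f(z) + kappa/2 * ||z - y||^2
            grad = A @ z - b + kappa * (z - y)
            z = z - step * grad
        x = z
        y = x + beta * (x - x_prev)      # extrapolation step
    return x
```

The parameter `kappa` trades off how well conditioned each auxiliary problem is against how many outer iterations are needed; the paper gives guidelines for choosing it per inner method, which this sketch does not reproduce.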

Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure

Nov 15, 2017

Alberto Bietti, Julien Mairal

Abstract: Stochastic optimization algorithms with variance reduction have proven successful for minimizing large finite sums of functions. Unfortunately, these techniques are unable to deal with stochastic perturbations of input data, induced for example by data augmentation. In such cases, the objective is no longer a finite sum, and the main candidate for optimization is the stochastic gradient descent method (SGD). In this paper, we introduce a variance reduction approach for these settings when the objective is composite and strongly convex. The convergence rate outperforms SGD with a typically much smaller constant factor, which depends on the variance of gradient estimates only due to perturbations on a single example.

* Advances in Neural Information Processing Systems (NIPS), Dec 2017, Long Beach, CA, United States

Learning Neural Representations of Human Cognition across Many fMRI Studies

Nov 11, 2017

Arthur Mensch, Julien Mairal, Danilo Bzdok, Bertrand Thirion, Gaël Varoquaux

Abstract: Cognitive neuroscience is enjoying a rapid increase in extensive public brain-imaging datasets. It opens the door to large-scale statistical models. Finding a unified perspective for all available data calls for scalable and automated solutions to an old challenge: how to aggregate heterogeneous information on brain function into a universal cognitive system that relates mental operations/cognitive processes/psychological tasks to brain networks? We cast this challenge in a machine-learning approach to predict conditions from statistical brain maps across different studies. For this, we leverage multi-task learning and multi-scale dimension reduction to learn low-dimensional representations of brain images that carry cognitive information and can be robustly associated with psychological stimuli. Compared to models without cognitive-aware low-dimensional representations, our multi-dataset classification model achieves the best prediction performance on several large reference datasets. It also brings a substantial performance boost to the analysis of small datasets, and can be introspected to identify universal template cognitive concepts.

* Advances in Neural Information Processing Systems, Dec 2017, Long Beach, United States

Stochastic Subsampling for Factorizing Huge Matrices

Oct 30, 2017

Arthur Mensch, Julien Mairal, Bertrand Thirion, Gaël Varoquaux

Abstract: We present a matrix-factorization algorithm that scales to input matrices with huge numbers of both rows and columns. Learned factors may be sparse or dense and/or non-negative, which makes our algorithm suitable for dictionary learning, sparse component analysis, and non-negative matrix factorization. Our algorithm streams matrix columns while subsampling them to iteratively learn the matrix factors. At each iteration, the row dimension of a new sample is reduced by subsampling, resulting in lower time complexity compared to a simple streaming algorithm. Our method comes with convergence guarantees to reach a stationary point of the matrix-factorization problem. We demonstrate its efficiency on massive functional Magnetic Resonance Imaging data (2 TB), and on patches extracted from hyperspectral images (103 GB). For both problems, which involve different penalties on rows and columns, we obtain significant speed-ups compared to state-of-the-art algorithms.

* IEEE Transactions on Signal Processing, 2018, 66 (1), pp 113-128

BlitzNet: A Real-Time Deep Network for Scene Understanding

Aug 09, 2017

Nikita Dvornik, Konstantin Shmelkov, Julien Mairal, Cordelia Schmid

Abstract: Real-time scene understanding has become crucial in many applications such as autonomous driving. In this paper, we propose a deep architecture, called BlitzNet, that jointly performs object detection and semantic segmentation in one forward pass, allowing real-time computations. Besides the computational gain of having a single network to perform several tasks, we show that object detection and semantic segmentation benefit from each other in terms of accuracy. Experimental results for VOC and COCO datasets show state-of-the-art performance for object detection and segmentation among real-time systems.

Catalyst Acceleration for Gradient-Based Non-Convex Optimization

Jun 09, 2017

Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui

Abstract: We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. When the objective is convex, the proposed approach enjoys the same properties as the Catalyst approach of Lin et al. [22]. When the objective is nonconvex, it achieves the best known convergence rate to stationary points for first-order methods. Specifically, the proposed algorithm does not require knowledge about the convexity of the objective; yet, it obtains an overall worst-case efficiency of $\tilde{O}(\varepsilon^{-2})$ and, if the function is convex, the complexity reduces to the near-optimal rate $\tilde{O}(\varepsilon^{-2/3})$. We conclude the paper by showing promising experimental results obtained by applying the proposed approach to SVRG and SAGA for sparse matrix factorization and for learning neural networks.
