
Guillaume Obozinski


École Polytechnique Fédérale de Lausanne, Swiss Data Science Center, ETH Zürich

An evaluation of deep learning models for predicting water depth evolution in urban floods

Feb 20, 2023
Stefania Russo, Nathanaël Perraudin, Steven Stalder, Fernando Perez-Cruz, Joao Paulo Leitao, Guillaume Obozinski, Jan Dirk Wegner

In this technical report we compare different deep learning models for the prediction of water depth rasters at high spatial resolution. Efficient, accurate, and fast methods for water depth prediction are increasingly important, as urban floods are becoming more frequent due to higher rainfall intensity caused by climate change, the expansion of cities, and changes in land use. While hydrodynamic models can provide reliable forecasts by simulating water depth at every location of a catchment, their high computational burden jeopardizes their application to real-time prediction in large urban areas at high spatial resolution. Here, we propose to address this issue with data-driven techniques. Specifically, we evaluate deep learning models trained to reproduce the data simulated by the CADDIES cellular-automata flood model, providing flood forecasts at different future time horizons. The advantage of such models is that they learn the underlying physical phenomena a priori, avoiding manual parameter setting and reducing the computational burden. We perform experiments on a dataset consisting of two catchment areas within Switzerland with 18 simpler, short rainfall patterns and 4 longer, more complex ones. Our results show that the deep learning models generally achieve lower errors than the other methods, especially for water depths $>0.5$ m. However, when tested on more complex rainfall events or unseen catchment areas, the deep models show no benefit over the simpler ones.
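
To make the surrogate-modelling idea concrete, the following toy sketch (an illustration only, not the paper's deep architectures) fits a per-pixel linear surrogate that maps the current depth raster and rainfall to the depth at the next time step, on synthetic data produced by a made-up linear "simulator" standing in for CADDIES:

```python
import numpy as np

rng = np.random.default_rng(0)
n, h, w = 200, 8, 8
depth_t = rng.random((n, h, w))                     # current depth rasters
rain = rng.random((n, 1, 1)) * np.ones((n, h, w))   # rainfall per sample
# made-up linear "simulator" standing in for the flood model output
depth_next = 0.9 * depth_t + 0.3 * rain

# per-pixel linear surrogate: depth(t+1) ~ a*depth(t) + b*rain + c
X = np.stack([depth_t.ravel(), rain.ravel(), np.ones(depth_t.size)], axis=1)
y = depth_next.ravel()
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
rmse = np.sqrt(np.mean((X @ coef - y) ** 2))
```

A real surrogate would replace the linear map with a convolutional network, but the training setup — regress simulated future rasters from current state and forcing — is the same.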

Optirank: classification for RNA-Seq data with optimal ranking reference genes

Jan 11, 2023
Paola Malsot, Filipe Martins, Didier Trono, Guillaume Obozinski

Classification algorithms using RNA-Sequencing (RNA-Seq) data as input are used in a variety of biological applications. By nature, RNA-Seq data is subject to uncontrolled fluctuations both within and, especially, across datasets, which makes it difficult for a trained classifier to generalize to an external dataset. Replacing raw gene counts with the rank of gene counts within each observation has proven effective in mitigating this problem. However, the rank of a feature is by definition relative to all other features, including highly variable features that introduce noise into the ranking. To address this problem and obtain more robust ranks, we propose a logistic regression model, optirank, which simultaneously learns the parameters of the model and the genes to use as a reference set in the ranking. We show the effectiveness of this method on simulated data. We also consider real classification tasks, which present different kinds of distribution shifts between train and test data. These tasks concern a variety of applications, such as classification of cancers of unknown primary, identification of specific gene signatures, and determination of cell type in single-cell RNA-Seq datasets. On these real tasks, optirank performs at least as well as vanilla logistic regression on classical ranks, while producing sparser solutions. In addition, to increase robustness against dataset shifts, we propose a multi-source learning scheme and demonstrate its effectiveness when used in combination with rank-based classifiers.
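
To see why rank features help, this small sketch (an illustration of the motivation, not the optirank algorithm itself, which additionally learns the reference gene set) checks that ranks computed within each sample are invariant to per-sample multiplicative scaling such as library-size effects:

```python
import numpy as np

rng = np.random.default_rng(1)
counts = rng.poisson(lam=10.0, size=(100, 20)).astype(float)
# simulate an uncontrolled per-sample scaling (e.g. library size)
scaled = counts * rng.uniform(0.5, 2.0, size=(100, 1))

def within_sample_ranks(X):
    # rank of each gene within its own sample (0 = smallest count);
    # a stable sort keeps ties in a deterministic order
    order = np.argsort(X, axis=1, kind="stable")
    ranks = np.empty_like(order)
    rows = np.arange(X.shape[0])[:, None]
    ranks[rows, order] = np.arange(X.shape[1])[None, :]
    return ranks

raw_ranks = within_sample_ranks(counts)
scaled_ranks = within_sample_ranks(scaled)
```

The two rank matrices are identical, whereas the raw counts differ; optirank additionally restricts the ranking to a learned reference set of genes so that highly variable genes do not perturb it.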

Robust detection and attribution of climate change under interventions

Dec 09, 2022
Enikő Székely, Sebastian Sippel, Nicolai Meinshausen, Guillaume Obozinski, Reto Knutti

Fingerprints are key tools in climate change detection and attribution (D&A). They are used to determine whether changes in observations differ from internal climate variability (detection), and whether observed changes can be assigned to specific external drivers (attribution). We propose a direct D&A approach based on supervised learning to extract fingerprints that lead to robust predictions under relevant interventions on exogenous variables, i.e., climate drivers other than the target. We employ anchor regression, a distributionally robust statistical learning method inspired by causal inference that extrapolates well to data perturbed by the interventions considered. The residuals from the prediction achieve either uncorrelatedness or mean independence with respect to the exogenous variables, thus guaranteeing robustness. We define D&A as a unified hypothesis-testing framework that relies on the same statistical model but uses different targets and test statistics. In the experiments, we first show that the CO2 forcing can be robustly predicted from spatial temperature patterns under strong interventions on the solar forcing. Second, we illustrate attribution to greenhouse gases and aerosols while protecting against interventions on the aerosol and CO2 forcings, respectively. Our study shows that incorporating robustness constraints against relevant interventions can significantly benefit the detection and attribution of climate change.
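
A minimal numpy sketch of the anchor regression estimator (following its standard transform-then-regress formulation; the data and variable names here are made up): each column of the data has its component in the anchor space shrunk or inflated before ordinary least squares, with $\gamma$ controlling the strength of the robustness constraint, and $\gamma = 1$ recovering plain OLS:

```python
import numpy as np

def anchor_transform(Z, A, gamma):
    # shrink (gamma < 1) or inflate (gamma > 1) the component of Z
    # lying in the column space of the anchors A
    P = A @ np.linalg.pinv(A)              # projection onto span(A)
    return Z - (1.0 - np.sqrt(gamma)) * (P @ Z)

rng = np.random.default_rng(2)
n, p = 300, 5
A = rng.normal(size=(n, 2))                # exogenous anchor variables
X = rng.normal(size=(n, p)) + A @ rng.normal(size=(2, p))
y = X @ rng.normal(size=p) + rng.normal(size=n)

gamma = 5.0                                # protection level
Xt = anchor_transform(X, A, gamma)
yt = anchor_transform(y[:, None], A, gamma)[:, 0]
beta_anchor, *_ = np.linalg.lstsq(Xt, yt, rcond=None)
```

Large $\gamma$ penalizes correlation between the residuals and the anchors, which is what yields predictions that remain valid under strong interventions on the exogenous drivers.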

Empirical Bayes Transductive Meta-Learning with Synthetic Gradients

Apr 27, 2020
Shell Xu Hu, Pablo G. Moreno, Yang Xiao, Xi Shen, Guillaume Obozinski, Neil D. Lawrence, Andreas Damianou

We propose a meta-learning approach that learns from multiple tasks in a transductive setting, leveraging the unlabeled query set in addition to the support set to build a more powerful model for each task. To develop our framework, we revisit the empirical Bayes formulation for multi-task learning. The evidence lower bound of the marginal log-likelihood of empirical Bayes decomposes as a sum of local KL divergences between the variational posterior and the true posterior on the query set of each task. We derive a novel amortized variational inference scheme that couples all the variational posteriors via a meta-model, which consists of a synthetic gradient network and an initialization network. Each variational posterior is obtained by synthetic gradient descent, approximating the true posterior on the query set even though we do not have access to the true gradient. Our results on the Mini-ImageNet and CIFAR-FS benchmarks for episodic few-shot classification outperform previous state-of-the-art methods. In addition, we conduct two zero-shot learning experiments to further explore the potential of synthetic gradients.
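
The core trick — descending with a learned gradient module instead of the inaccessible true gradient — can be illustrated on a toy quadratic loss (this is an assumption-laden sketch with a linear synthetic-gradient map, not the paper's networks):

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4
w_star = rng.normal(size=d)                # optimum of the quadratic loss
# true gradient of 0.5 * ||w - w_star||^2 is w - w_star (unknown at "test" time)

# train a linear synthetic-gradient map g(w) = w @ M + c to imitate
# the true gradient on sampled points
W_samp = rng.normal(size=(200, d))
Phi = np.hstack([W_samp, np.ones((200, 1))])
sol, *_ = np.linalg.lstsq(Phi, W_samp - w_star, rcond=None)
M, c = sol[:d], sol[d]

# inner loop: descend using only synthetic gradients
w = rng.normal(size=d)
for _ in range(100):
    w = w - 0.5 * (w @ M + c)
```

Because the synthetic module has learned to imitate the gradient field, the inner loop converges to the optimum without ever evaluating the true gradient — the same role the synthetic gradient network plays for the variational posteriors of each task.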

* ICLR 2020 

Tensor Decompositions for temporal knowledge base completion

Apr 10, 2020
Timothée Lacroix, Guillaume Obozinski, Nicolas Usunier

Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that are valid only at certain points in time. For the problem of link prediction under temporal constraints, i.e., answering queries such as (US, has president, ?, 2012), we propose a solution inspired by the canonical decomposition of tensors of order 4. We introduce new regularization schemes and present an extension of ComplEx (Trouillon et al., 2016) that achieves state-of-the-art performance. Additionally, we propose a new dataset for knowledge base completion constructed from Wikidata, larger than previous benchmarks by an order of magnitude, as a new reference for evaluating temporal and non-temporal link prediction methods.
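
An order-4 CP scoring function of this kind can be sketched in a few lines (random embeddings here for illustration; real models train them on observed facts, and the paper's best model is a ComplEx extension rather than plain real-valued CP):

```python
import numpy as np

rng = np.random.default_rng(4)
n_ent, n_rel, n_time, rank = 50, 10, 12, 16
U = rng.normal(size=(n_ent, rank))    # subject embeddings
V = rng.normal(size=(n_rel, rank))    # relation embeddings
W = rng.normal(size=(n_ent, rank))    # object embeddings
Z = rng.normal(size=(n_time, rank))   # timestamp embeddings

def score(s, r, o, t):
    # order-4 CP score of the fact (subject, relation, object, time)
    return float(np.sum(U[s] * V[r] * W[o] * Z[t]))

def answer(s, r, t):
    # answer the query (s, r, ?, t) by scoring every candidate object
    return int(np.argmax((U[s] * V[r] * Z[t]) @ W.T))
```

Queries such as (US, has president, ?, 2012) then reduce to one matrix-vector product over the object embedding table.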

Learning the effect of latent variables in Gaussian Graphical models with unobserved variables

Jul 31, 2018
Marina Vinyes, Guillaume Obozinski

The edge structure of the graph defining an undirected graphical model describes precisely the structure of dependence between the variables in the graph. In many applications, the dependence structure is unknown and it is desirable to learn it from data, often because this is a preliminary step for ascertaining causal effects. This problem, known as structure learning, is hard in general, but for Gaussian graphical models it is slightly easier because the structure of the graph is given by the sparsity pattern of the precision matrix of the joint distribution, and because independence coincides with decorrelation. A major difficulty, too often ignored in structure learning, is that if some variables are not observed, the marginal dependence graph over the observed variables may be significantly more complex and no longer reflect the direct dependencies that are potentially associated with causal effects. In this work, we consider a family of latent variable Gaussian graphical models in which the graph of the joint distribution between observed and unobserved variables is sparse, and the unobserved variables are conditionally independent given the others. Prior work was able to recover the connectivity between observed variables, but could only identify the subspace spanned by the unobserved variables. In contrast, we propose a convex optimization formulation based on structured matrix sparsity that estimates the complete connectivity of the full graph, including the unobserved variables, given the number of missing variables and a priori knowledge of their level of connectivity. Our formulation is supported by a theoretical identifiability result for the latent dependence structure of sparse graphs in the infinite-data limit. We propose an algorithm leveraging recent active set methods, which performs well in experiments on synthetic data.
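
The phenomenon motivating this line of work — marginalizing hidden variables out of a sparse joint Gaussian leaves an observed precision matrix that is a sparse term minus a low-rank term (a Schur complement) — can be verified numerically; the dimensions and edge weights below are made up for illustration:

```python
import numpy as np

p_obs, p_hid = 6, 2
# sparse joint precision over observed + hidden variables
K = np.eye(p_obs + p_hid) * 2.0
K[0, 1] = K[1, 0] = 0.4                      # one observed-observed edge
K[0, p_obs] = K[p_obs, 0] = 0.5              # hidden var 1 touches two
K[2, p_obs] = K[p_obs, 2] = 0.5              # observed variables
K[3, p_obs + 1] = K[p_obs + 1, 3] = 0.5      # hidden var 2 touches one

K_oo = K[:p_obs, :p_obs]
K_oh = K[:p_obs, p_obs:]
K_hh = K[p_obs:, p_obs:]
# marginal precision of the observed block: Schur complement
L = K_oh @ np.linalg.inv(K_hh) @ K_oh.T      # low-rank correction
K_marg = K_oo - L
rank_L = np.linalg.matrix_rank(L)
```

The correction term has rank equal to the number of hidden variables, which is why knowing that number (plus the connectivity level of the hidden variables) makes the complete graph identifiable.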

Canonical Tensor Decomposition for Knowledge Base Completion

Jun 19, 2018
Timothée Lacroix, Nicolas Usunier, Guillaume Obozinski

The problem of Knowledge Base Completion can be framed as a 3rd-order binary tensor completion problem. In this light, the Canonical Tensor Decomposition (CP) (Hitchcock, 1927) seems like a natural solution; however, current implementations of CP on standard Knowledge Base Completion benchmarks are lagging behind their competitors. In this work, we attempt to understand the limits of CP for knowledge base completion. First, we motivate and test a novel regularizer, based on tensor nuclear $p$-norms. Then, we present a reformulation of the problem that makes it invariant to arbitrary choices in the inclusion of predicates or their reciprocals in the dataset. These two methods combined allow us to beat the current state of the art on several datasets with a CP decomposition, and obtain even better results using the more advanced ComplEx model.
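
A sketch of the two ingredients (our own minimal rendering with random parameters, not the authors' code): a nuclear 3-norm style penalty applied to the factors of each sampled triple, and the reciprocal-predicate reformulation in which each fact (s, r, o) also yields the inverted query (o, r⁻¹, ?):

```python
import numpy as np

rng = np.random.default_rng(6)
n_ent, n_rel, rank = 40, 8, 16
U = rng.normal(size=(n_ent, rank)) * 0.1      # subject embeddings
V = rng.normal(size=(2 * n_rel, rank)) * 0.1  # relations + reciprocals
W = rng.normal(size=(n_ent, rank)) * 0.1      # object embeddings

def cp_scores(s, r):
    # CP scores of every candidate object for the query (s, r, ?)
    return (U[s] * V[r]) @ W.T

def n3_penalty(s, r, o, lam=1e-3):
    # nuclear 3-norm surrogate on the factors of one sampled triple
    return lam * (np.sum(np.abs(U[s]) ** 3)
                  + np.sum(np.abs(V[r]) ** 3)
                  + np.sum(np.abs(W[o]) ** 3))

def training_queries(s, r, o):
    # reciprocal reformulation: each fact yields two ranking problems
    return [(s, r), (o, r + n_rel)]   # r + n_rel indexes the reciprocal
```

Training then minimizes a cross-entropy ranking loss over `cp_scores` plus the penalty, summed over both queries of each fact.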

Tight convex relaxations for sparse matrix factorization

Dec 04, 2014
Emile Richard, Guillaume Obozinski, Jean-Philippe Vert

Based on a new atomic norm, we propose a new convex formulation for sparse matrix factorization problems in which the number of nonzero elements of the factors is assumed fixed and known. The formulation counts sparse PCA with multiple factors, subspace clustering and low-rank sparse bilinear regression as potential applications. We compute slow rates and an upper bound on the statistical dimension of the suggested norm for rank 1 matrices, showing that its statistical dimension is an order of magnitude smaller than the usual $\ell_1$-norm, trace norm and their combinations. Even though our convex formulation is in theory hard and does not lead to provably polynomial time algorithmic schemes, we propose an active set algorithm leveraging the structure of the convex problem to solve it and show promising numerical results.
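
For tiny sizes the dual of such a sparsity-constrained atomic norm can be evaluated by brute force, which makes the definition concrete (an exponential-time enumeration sketch for intuition only; the paper instead uses an active-set algorithm). Taking the atoms to be unit-norm rank-1 matrices $ab^\top$ with $a$ being $k$-sparse and $b$ being $q$-sparse, the dual norm of $Z$ is the largest singular value over all $k \times q$ submatrices:

```python
import numpy as np
from itertools import combinations

def dual_atomic_norm(Z, k, q):
    # dual norm: max over k-sparse unit a and q-sparse unit b of a^T Z b,
    # i.e. the top singular value over all k x q submatrices of Z
    m, n = Z.shape
    best = 0.0
    for I in combinations(range(m), k):
        for J in combinations(range(n), q):
            sub = Z[np.ix_(I, J)]
            best = max(best, np.linalg.svd(sub, compute_uv=False)[0])
    return best
```

For k = q = 1 this reduces to the largest absolute entry (the dual of the $\ell_1$-norm), and for full k, q to the spectral norm (the dual of the trace norm), which is exactly the interpolation the atomic norm is built for.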

Domain adaptation for sequence labeling using hidden Markov models

Dec 14, 2013
Edouard Grave, Guillaume Obozinski, Francis Bach

Most natural language processing systems based on machine learning are not robust to domain shift. For example, a state-of-the-art syntactic dependency parser trained on Wall Street Journal sentences has an absolute drop in performance of more than ten points when tested on textual data from the Web. An efficient solution to make these methods more robust to domain shift is to first learn a word representation using large amounts of unlabeled data from both domains, and then use this representation as features in a supervised learning algorithm. In this paper, we propose to use hidden Markov models to learn word representations for part-of-speech tagging. In particular, we study the influence of using data from the source, the target, or both domains to learn the representation, and the different ways of representing words with an HMM.
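
One natural way to represent a word with an HMM is its state-posterior vector. The sketch below (our own minimal forward-backward implementation with per-step rescaling, and made-up parameters; real representations would come from an HMM trained on the unlabeled corpora) returns, for each token position, the posterior distribution over hidden states given the whole sentence:

```python
import numpy as np

def hmm_posteriors(obs, pi, A, B):
    # obs: word indices; pi: initial state probs; A: state transitions;
    # B[s, w]: probability of emitting word w in state s.
    # Returns gamma[t, s] = p(state_t = s | all observations).
    T, S = len(obs), len(pi)
    alpha = np.zeros((T, S))
    beta = np.ones((T, S))
    alpha[0] = pi * B[:, obs[0]]
    alpha[0] /= alpha[0].sum()             # rescale to avoid underflow
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        alpha[t] /= alpha[t].sum()
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# toy 2-state, 3-word model
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
gamma = hmm_posteriors([0, 2, 1, 2], pi, A, B)
```

Each row of `gamma` is a low-dimensional, domain-general feature vector for the token at that position, which the supervised tagger can consume alongside (or instead of) the raw word identity.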

* New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks (NIPS Workshop) (2013) 

On the Equivalence between Herding and Conditional Gradient Algorithms

Sep 11, 2012
Francis Bach, Simon Lacoste-Julien, Guillaume Obozinski

We show that the herding procedure of Welling (2009) takes exactly the form of a standard convex optimization algorithm, namely a conditional gradient algorithm minimizing a quadratic moment discrepancy. This link enables us to invoke convergence results from convex optimization and to consider faster alternatives for the task of approximating integrals in a reproducing kernel Hilbert space. We study the behavior of the different variants through numerical simulations. The experiments indicate that while we can improve over herding on the task of approximating integrals, the original herding algorithm tends to approach the maximum-entropy distribution more often, shedding more light on the learning bias behind herding.
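
Kernel herding seen as conditional gradient can be sketched on a finite candidate set (a toy illustration with an RBF kernel and made-up data, using the fixed 1/t step sizes that recover the original herding updates): each step greedily picks the point whose feature map best aligns with the residual between the target mean embedding and the current average.

```python
import numpy as np

def rbf(x, y, s=1.0):
    # Gaussian kernel matrix between 1-d point sets x and y
    return np.exp(-((x[:, None] - y[None, :]) ** 2) / (2 * s * s))

rng = np.random.default_rng(7)
cand = rng.normal(size=200)               # candidate points / support
mu = rbf(cand, cand).mean(axis=1)         # target mean embedding at cand

chosen = []
ksum = np.zeros(len(cand))                # sum of k(., x_s) over chosen
for t in range(1, 21):
    # conditional gradient step: maximize <mu - g_t, Phi(x)> over atoms,
    # where g_t is the uniform average of the chosen feature maps
    scores = mu - ksum / max(t - 1, 1)
    j = int(np.argmax(scores))
    chosen.append(cand[j])
    ksum += rbf(cand, cand[j:j + 1])[:, 0]

herd_err = np.max(np.abs(ksum / 20 - mu))  # discrepancy on the candidates
```

Replacing the 1/t steps with line search or away steps gives the faster conditional-gradient variants the paper compares against.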

* ICML 2012, International Conference on Machine Learning, Edinburgh, UK (2012)