Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Stefanie Jegelka

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA

A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs

Nov 17, 2023

Thien Le, Luana Ruiz, Stefanie Jegelka

Figure 1 for A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs

Figure 2 for A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs

Figure 3 for A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs

Abstract:Large-scale graph machine learning is challenging as the complexity of learning models scales with the graph size. Subsampling the graph is a viable alternative, but sampling on graphs is nontrivial as graphs are non-Euclidean. Existing graph sampling techniques require not only computing the spectra of large matrices but also repeating these computations when the graph changes, e.g., grows. In this paper, we introduce a signal sampling theory for a type of graph limit -- the graphon. We prove a Poincar\'e inequality for graphon signals and show that complements of node subsets satisfying this inequality are unique sampling sets for Paley-Wiener spaces of graphon signals. Exploiting connections with spectral clustering and Gaussian elimination, we prove that such sampling sets are consistent in the sense that unique sampling sets on a convergent graph sequence converge to unique sampling sets on the graphon. We then propose a related graphon signal sampling algorithm for large graphs, and demonstrate its good empirical performance on graph machine learning tasks.

* 23 pages

Via

Access Paper or Ask Questions

Sample Complexity Bounds for Estimating Probability Divergences under Invariances

Nov 06, 2023

Behrooz Tahmasebi, Stefanie Jegelka

Figure 1 for Sample Complexity Bounds for Estimating Probability Divergences under Invariances

Abstract:Group-invariant probability distributions appear in many data-generative models in machine learning, such as graphs, point clouds, and images. In practice, one often needs to estimate divergences between such distributions. In this work, we study how the inherent invariances, with respect to any smooth action of a Lie group on a manifold, improve sample complexity when estimating the Wasserstein distance, the Sobolev Integral Probability Metrics (Sobolev IPMs), the Maximum Mean Discrepancy (MMD), and also the complexity of the density estimation problem (in the $L^2$ and $L^\infty$ distance). Our results indicate a two-fold gain: (1) reducing the sample complexity by a multiplicative factor corresponding to the group size (for finite groups) or the normalized volume of the quotient space (for groups of positive dimension); (2) improving the exponent in the convergence rate (for groups of positive dimension). These results are completely new for groups of positive dimension and extend recent bounds for finite group actions.

Via

Access Paper or Ask Questions

Are Graph Neural Networks Optimal Approximation Algorithms?

Oct 09, 2023

Morris Yau, Eric Lu, Nikolaos Karalias, Jessica Xu, Stefanie Jegelka

Abstract:In this work we design graph neural network architectures that can be used to obtain optimal approximation algorithms for a large class of combinatorial optimization problems using powerful algorithmic tools from semidefinite programming (SDP). Concretely, we prove that polynomial-sized message passing algorithms can represent the most powerful polynomial time algorithms for Max Constraint Satisfaction Problems assuming the Unique Games Conjecture. We leverage this result to construct efficient graph neural network architectures, OptGNN, that obtain high-quality approximate solutions on landmark combinatorial optimization problems such as Max Cut and maximum independent set. Our approach achieves strong empirical results across a wide range of real-world and synthetic datasets against both neural baselines and classical algorithms. Finally, we take advantage of OptGNN's ability to capture convex relaxations to design an algorithm for producing dual certificates of optimality (bounds on the optimal solution) from the learned embeddings of OptGNN.

* Updated figure 1

Via

Access Paper or Ask Questions

On the Stability of Expressive Positional Encodings for Graph Neural Networks

Oct 04, 2023

Yinan Huang, William Lu, Joshua Robinson, Yu Yang, Muhan Zhang, Stefanie Jegelka, Pan Li

Abstract:Designing effective positional encodings for graphs is key to building powerful graph transformers and enhancing message-passing graph neural networks. Although widespread, using Laplacian eigenvectors as positional encodings faces two fundamental challenges: (1) \emph{Non-uniqueness}: there are many different eigendecompositions of the same Laplacian, and (2) \emph{Instability}: small perturbations to the Laplacian could result in completely different eigenspaces, leading to unpredictable changes in positional encoding. Despite many attempts to address non-uniqueness, most methods overlook stability, leading to poor generalization on unseen graph structures. We identify the cause of instability to be a "hard partition" of eigenspaces. Hence, we introduce Stable and Expressive Positional Encodings (SPE), an architecture for processing eigenvectors that uses eigenvalues to "softly partition" eigenspaces. SPE is the first architecture that is (1) provably stable, and (2) universally expressive for basis invariant functions whilst respecting all symmetries of eigenvectors. Besides guaranteed stability, we prove that SPE is at least as expressive as existing methods, and highly capable of counting graph structures. Finally, we evaluate the effectiveness of our method on molecular property prediction, and out-of-distribution generalization tasks, finding improved generalization compared to existing positional encoding methods.

Via

Access Paper or Ask Questions

Context is Environment

Sep 20, 2023

Sharut Gupta, Stefanie Jegelka, David Lopez-Paz, Kartik Ahuja

Abstract:Two lines of work are taking the central stage in AI research. On the one hand, the community is making increasing efforts to build models that discard spurious correlations and generalize better in novel test environments. Unfortunately, the bitter lesson so far is that no proposal convincingly outperforms a simple empirical risk minimization baseline. On the other hand, large language models (LLMs) have erupted as algorithms able to learn in-context, generalizing on-the-fly to eclectic contextual circumstances that users enforce by means of prompting. In this paper, we argue that context is environment, and posit that in-context learning holds the key to better domain generalization. Via extensive theory and experiments, we show that paying attention to context$\unicode{x2013}\unicode{x2013}$unlabeled examples as they arrive$\unicode{x2013}\unicode{x2013}$allows our proposed In-Context Risk Minimization (ICRM) algorithm to zoom-in on the test environment risk minimizer, leading to significant out-of-distribution performance improvements. From all of this, two messages are worth taking home. Researchers in domain generalization should consider environment as context, and harness the adaptive power of in-context learning. Researchers in LLMs should consider context as environment, to better structure data towards generalization.

* 41 Pages, 4 Figures

Via

Access Paper or Ask Questions

Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning

Jun 24, 2023

Sharut Gupta, Joshua Robinson, Derek Lim, Soledad Villar, Stefanie Jegelka

Figure 1 for Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning

Figure 2 for Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning

Figure 3 for Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning

Figure 4 for Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning

Abstract:Self-supervised learning converts raw perceptual data such as images to a compact space where simple Euclidean distances measure meaningful variations in data. In this paper, we extend this formulation by adding additional geometric structure to the embedding space by enforcing transformations of input space to correspond to simple (i.e., linear) transformations of embedding space. Specifically, in the contrastive learning setting, we introduce an equivariance objective and theoretically prove that its minima forces augmentations on input space to correspond to rotations on the spherical embedding space. We show that merely combining our equivariant loss with a non-collapse term results in non-trivial representations, without requiring invariance to data augmentations. Optimal performance is achieved by also encouraging approximate invariance, where input augmentations correspond to small rotations. Our method, CARE: Contrastive Augmentation-induced Rotational Equivariance, leads to improved performance on downstream tasks, and ensures sensitivity in embedding space to important variations in data (e.g., color) that standard contrastive methods do not achieve. Code is available at https://github.com/Sharut/CARE.

* 22 pages

Via

Access Paper or Ask Questions

The Inductive Bias of Flatness Regularization for Deep Matrix Factorization

Jun 22, 2023

Khashayar Gatmiry, Zhiyuan Li, Ching-Yao Chuang, Sashank Reddi, Tengyu Ma, Stefanie Jegelka

Figure 1 for The Inductive Bias of Flatness Regularization for Deep Matrix Factorization

Figure 2 for The Inductive Bias of Flatness Regularization for Deep Matrix Factorization

Figure 3 for The Inductive Bias of Flatness Regularization for Deep Matrix Factorization

Figure 4 for The Inductive Bias of Flatness Regularization for Deep Matrix Factorization

Abstract:Recent works on over-parameterized neural networks have shown that the stochasticity in optimizers has the implicit regularization effect of minimizing the sharpness of the loss function (in particular, the trace of its Hessian) over the family zero-loss solutions. More explicit forms of flatness regularization also empirically improve the generalization performance. However, it remains unclear why and when flatness regularization leads to better generalization. This work takes the first step toward understanding the inductive bias of the minimum trace of the Hessian solutions in an important setting: learning deep linear networks from linear measurements, also known as \emph{deep matrix factorization}. We show that for all depth greater than one, with the standard Restricted Isometry Property (RIP) on the measurements, minimizing the trace of Hessian is approximately equivalent to minimizing the Schatten 1-norm of the corresponding end-to-end matrix parameters (i.e., the product of all layer matrices), which in turn leads to better generalization. We empirically verify our theoretical findings on synthetic datasets.

Via

Access Paper or Ask Questions

Limits, approximation and size transferability for GNNs on sparse graphs via graphops

Jun 07, 2023

Thien Le, Stefanie Jegelka

Abstract:Can graph neural networks generalize to graphs that are different from the graphs they were trained on, e.g., in size? In this work, we study this question from a theoretical perspective. While recent work established such transferability and approximation results via graph limits, e.g., via graphons, these only apply non-trivially to dense graphs. To include frequently encountered sparse graphs such as bounded-degree or power law graphs, we take a perspective of taking limits of operators derived from graphs, such as the aggregation operation that makes up GNNs. This leads to the recently introduced limit notion of graphops (Backhausz and Szegedy, 2022). We demonstrate how the operator perspective allows us to develop quantitative bounds on the distance between a finite GNN and its limit on an infinite graph, as well as the distance between the GNN on graphs of different sizes that share structural properties, under a regularity assumption verified for various graph sequences. Our results hold for dense and sparse graphs, and various notions of graph limits.

* NeurIPS 2023 submission, 34 pages

Via

Access Paper or Ask Questions

The Exact Sample Complexity Gain from Invariances for Kernel Regression on Manifolds

Mar 24, 2023

Behrooz Tahmasebi, Stefanie Jegelka

Abstract:In practice, encoding invariances into models helps sample complexity. In this work, we tighten and generalize theoretical results on how invariances improve sample complexity. In particular, we provide minimax optimal rates for kernel ridge regression on any manifold, with a target function that is invariant to an arbitrary group action on the manifold. Our results hold for (almost) any group action, even groups of positive dimension. For a finite group, the gain increases the "effective" number of samples by the group size. For groups of positive dimension, the gain is observed by a reduction in the manifold's dimension, in addition to a factor proportional to the volume of the quotient space. Our proof takes the viewpoint of differential geometry, in contrast to the more common strategy of using invariant polynomials. Hence, this new geometric viewpoint on learning with invariances may be of independent interest.

Via

Access Paper or Ask Questions

Tetris-inspired detector with neural network for radiation mapping

Feb 07, 2023

Ryotaro Okabe, Shangjie Xue, Jiankai Yu, Tongtong Liu, Benoit Forget, Stefanie Jegelka, Gordon Kohse, Lin-wen Hu, Mingda Li

Abstract:In recent years, radiation mapping has attracted widespread research attention and increased public concerns on environmental monitoring. In terms of both materials and their configurations, radiation detectors have been developed to locate the directions and positions of the radiation sources. In this process, algorithm is essential in converting detector signals to radiation source information. However, due to the complex mechanisms of radiation-matter interaction and the current limitation of data collection, high-performance, low-cost radiation mapping is still challenging. Here we present a computational framework using Tetris-inspired detector pixels and machine learning for radiation mapping. Using inter-pixel padding to increase the contrast between pixels and neural network to analyze the detector readings, a detector with as few as four pixels can achieve high-resolution directional mapping. By further imposing Maximum a Posteriori (MAP) with a moving detector, further radiation position localization is achieved. Non-square, Tetris-shaped detector can further improve performance beyond the conventional grid-shaped detector. Our framework offers a new avenue for high quality radiation mapping with least number of detector pixels possible, and is anticipated to be capable to deploy for real-world radiation detection with moderate validation.

* 29 pages, 20 figures. Ryotaro Okabe and Shangjie Xue contributed equally to this work

Via

Access Paper or Ask Questions