Benjamin Recht

Do Offline Metrics Predict Online Performance in Recommender Systems?

Nov 07, 2020

Karl Krauth, Sarah Dean, Alex Zhao, Wenshuo Guo, Mihaela Curmei, Benjamin Recht, Michael I. Jordan

Abstract: Recommender systems operate in an inherently dynamical setting. Past recommendations influence future behavior, including which data points are observed and how user preferences change. However, experimenting in production systems with real user dynamics is often infeasible, and existing simulation-based approaches have limited scale. As a result, many state-of-the-art algorithms are designed to solve supervised learning problems, and progress is judged only by offline metrics. In this work we investigate the extent to which offline metrics predict online performance by evaluating eleven recommenders across six controlled simulated environments. We observe that offline metrics are correlated with online performance over a range of environments. However, improvements in offline metrics lead to diminishing returns in online performance. Furthermore, we observe that the ranking of recommenders varies depending on the amount of initial offline data available. We study the impact of adding exploration strategies, and observe that their effectiveness, when compared to greedy recommendation, is highly dependent on the recommendation algorithm. We provide the environments and recommenders described in this paper as RecLab, an extensible, ready-to-use simulation framework, at https://github.com/berkeley-reclab/RecLab.

Guaranteeing Safety of Learned Perception Modules via Measurement-Robust Control Barrier Functions

Oct 30, 2020

Sarah Dean, Andrew J. Taylor, Ryan K. Cosner, Benjamin Recht, Aaron D. Ames

Abstract: Modern nonlinear control theory seeks to develop feedback controllers that endow systems with properties such as safety and stability. The guarantees ensured by these controllers often rely on accurate estimates of the system state for determining control actions. In practice, measurement model uncertainty can lead to error in state estimates that degrades these guarantees. In this paper, we seek to unify techniques from control theory and machine learning to synthesize controllers that achieve safety in the presence of measurement model uncertainty. We define the notion of a Measurement-Robust Control Barrier Function (MR-CBF) as a tool for determining safe control inputs when facing measurement model uncertainty. Furthermore, MR-CBFs are used to inform sampling methodologies for learning-based perception systems and quantify tolerable error in the resulting learned models. We demonstrate the efficacy of MR-CBFs in achieving safety with measurement model uncertainty on a simulated Segway system.
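
For context, the standard (non-robust) control barrier function condition that this line of work builds on can be written as below. The notation ($f$, $g$, $h$, $\alpha$, $U$) is assumed here for illustration and is not taken from the abstract; the paper's measurement-robust definition, which accounts for state-estimate error, is not reproduced.

$$
\dot{x} = f(x) + g(x)\,u, \qquad \mathcal{C} = \{x : h(x) \ge 0\}, \qquad \sup_{u \in U} \big[ L_f h(x) + L_g h(x)\,u \big] \ge -\alpha(h(x)) \quad \text{for all } x \in \mathcal{C}
$$

Here $L_f h$ and $L_g h$ denote Lie derivatives of $h$ along $f$ and $g$, and $\alpha$ is an extended class-$\mathcal{K}$ function. Any input satisfying the inequality at the true state keeps the safe set $\mathcal{C}$ forward invariant, which is why errors in the state estimate can undermine the guarantee.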

A Generalizable and Accessible Approach to Machine Learning with Global Satellite Imagery

Oct 16, 2020

Esther Rolf, Jonathan Proctor, Tamma Carleton, Ian Bolliger, Vaishaal Shankar, Miyabi Ishihara, Benjamin Recht, Solomon Hsiang

Abstract: Combining satellite imagery with machine learning (SIML) has the potential to address global challenges by remotely estimating socioeconomic and environmental conditions in data-poor regions, yet the resource requirements of SIML limit its accessibility and use. We show that a single encoding of satellite imagery can generalize across diverse prediction tasks (e.g. forest cover, house price, road length). Our method achieves accuracy competitive with deep neural networks at orders of magnitude lower computational cost, scales globally, delivers label super-resolution predictions, and facilitates characterizations of uncertainty. Since image encodings are shared across tasks, they can be centrally computed and distributed to unlimited researchers, who need only fit a linear regression to their own ground truth data in order to achieve state-of-the-art SIML performance.
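
The workflow described in the abstract, one shared encoding reused by many tasks with only a linear regression fit per task, might look roughly like the sketch below. The feature values, the task names in the dictionaries, and the helpers `solve_linear_system`, `fit_ridge`, and `predict` are all hypothetical; this is plain ridge regression on made-up numbers, not the paper's featurization or code.

```python
# Sketch only: the feature values, task names, and helper functions below are
# illustrative assumptions, not the paper's featurization or code.

def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def fit_ridge(features, labels, lam=0.0):
    """Return w minimizing ||features w - labels||^2 + lam * ||w||^2."""
    d = len(features[0])
    gram = [[sum(x[i] * x[j] for x in features) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(x[i] * y for x, y in zip(features, labels)) for i in range(d)]
    return solve_linear_system(gram, rhs)


def predict(features, w):
    """Apply a fitted linear model to each feature row."""
    return [sum(xi * wi for xi, wi in zip(x, w)) for x in features]


# Hypothetical precomputed encodings: one row per image tile, computed once.
shared_features = [
    [1.0, 0.2, 0.7],
    [1.0, 0.9, 0.1],
    [1.0, 0.4, 0.4],
    [1.0, 0.8, 0.6],
    [1.0, 0.1, 0.3],
]

# Two made-up tasks reuse the same encodings; only the labels differ.
true_weights = {"forest_cover": [0.1, 0.5, -0.2], "road_length": [2.0, -1.0, 3.0]}
labels = {task: predict(shared_features, w) for task, w in true_weights.items()}
fitted = {task: fit_ridge(shared_features, y) for task, y in labels.items()}
```

The point of the design is that the expensive step (computing `shared_features`) happens once, while each new task costs only one small linear solve.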

Certainty Equivalent Perception-Based Control

Aug 27, 2020

Sarah Dean, Benjamin Recht

Abstract: In order to certify performance and safety, feedback control requires precise characterization of sensor errors. In this paper, we provide guarantees on such feedback systems when sensors are characterized by solving a supervised learning problem. We show a uniform error bound on nonparametric kernel regression under a dynamically-achievable dense sampling scheme. This allows for a finite-time convergence rate on the sub-optimality of using the regressor in closed-loop for waypoint tracking. We demonstrate our results in simulation with simplified unmanned aerial vehicle and autonomous driving examples.

Measuring Robustness to Natural Distribution Shifts in Image Classification

Jul 01, 2020

Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt

Abstract: We study how robust current ImageNet models are to distribution shifts arising from natural variations in datasets. Most research on robustness focuses on synthetic image perturbations (noise, simulated weather artifacts, adversarial examples, etc.), which leaves open how robustness on synthetic distribution shift relates to distribution shift arising in real data. Informed by an evaluation of 196 ImageNet models in 211 different test conditions, we find that there is little to no transfer of robustness from current synthetic to natural distribution shift. Moreover, most current techniques provide no robustness to the natural distribution shifts in our testbed. The main exception is training on larger datasets, which in some cases offers small gains in robustness. Our results indicate that distribution shifts arising in real data are currently an open research problem.

Active Learning for Nonlinear System Identification with Guarantees

Jun 18, 2020

Horia Mania, Michael I. Jordan, Benjamin Recht

Abstract: While the identification of nonlinear dynamical systems is a fundamental building block of model-based reinforcement learning and feedback control, its sample complexity is understood only for systems that either have discrete states and actions or can be identified from data generated by i.i.d. random inputs. Nonetheless, many interesting dynamical systems have continuous states and actions and can only be identified through a judicious choice of inputs. Motivated by practical settings, we study a class of nonlinear dynamical systems whose state transitions depend linearly on a known feature embedding of state-action pairs. To estimate such systems in finite time, identification methods must explore all directions in feature space. We propose an active learning approach that achieves this by repeating three steps: trajectory planning, trajectory tracking, and re-estimation of the system from all available data. We show that our method estimates nonlinear dynamical systems at a parametric rate, similar to the statistical rate of standard linear regression.

* 29 pages
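
The model class in the abstract, state transitions that are linear in a known feature embedding, makes the re-estimation step an ordinary least-squares problem. The sketch below illustrates only that step on a made-up scalar system; the feature map `features`, the parameter values, and the input signal are assumptions, and the inputs are fixed in advance rather than chosen by the trajectory planning and tracking steps the paper proposes.

```python
import math

# Sketch only: the scalar system, the feature map, and the input signal are
# illustrative assumptions. The inputs here are fixed in advance; the paper's
# method instead plans trajectories to choose informative inputs.

def features(x, u):
    """Hypothetical known feature embedding phi(x, u)."""
    return (math.sin(x), u)


def simulate(theta, x0, inputs):
    """Roll out x_{t+1} = theta . phi(x_t, u_t) without noise."""
    xs = [x0]
    for u in inputs:
        phi = features(xs[-1], u)
        xs.append(theta[0] * phi[0] + theta[1] * phi[1])
    return xs


def least_squares(xs, inputs):
    """Estimate theta from one trajectory via the 2x2 normal equations."""
    g11 = g12 = g22 = b1 = b2 = 0.0
    for x, u, x_next in zip(xs, inputs, xs[1:]):
        p1, p2 = features(x, u)
        g11 += p1 * p1
        g12 += p1 * p2
        g22 += p2 * p2
        b1 += p1 * x_next
        b2 += p2 * x_next
    # Cramer's rule; det is nonzero only if the data excite both features.
    det = g11 * g22 - g12 * g12
    return ((g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det)


true_theta = (0.8, 0.5)
inputs = [math.cos(0.7 * t) for t in range(50)]
trajectory = simulate(true_theta, 0.1, inputs)
estimate = least_squares(trajectory, inputs)
```

The determinant check hints at why exploration matters: if the collected data never excite some direction in feature space, the normal equations become singular and the parameters cannot be recovered.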

The Effect of Natural Distribution Shift on Question Answering Models

Apr 29, 2020

John Miller, Karl Krauth, Benjamin Recht, Ludwig Schmidt

Abstract: We build four new test sets for the Stanford Question Answering Dataset (SQuAD) and evaluate the ability of question-answering systems to generalize to new data. Our first test set is from the original Wikipedia domain and measures the extent to which existing systems overfit the original test set. Despite several years of heavy test set re-use, we find no evidence of adaptive overfitting. The remaining three test sets are constructed from New York Times articles, Reddit posts, and Amazon product reviews and measure robustness to natural distribution shifts. Across a broad range of models, we observe average performance drops of 3.8, 14.0, and 17.4 F1 points, respectively. In contrast, a strong human baseline matches or exceeds the performance of SQuAD models on the original domain and exhibits little to no drop in new domains. Taken together, our results confirm the surprising resilience of the holdout method and emphasize the need to move towards evaluation metrics that incorporate robustness to natural distribution shifts.

Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information

Mar 12, 2020

Esther Rolf, Michael I. Jordan, Benjamin Recht

Abstract: Observational data are often accompanied by natural structural indices, such as time stamps or geographic locations, which are meaningful to prediction tasks but are often discarded. We leverage semantically meaningful indexing data while ensuring robustness to potentially uninformative or misleading indices. We propose a post-estimation smoothing operator as a fast and effective method for incorporating structural index data into prediction. Because the smoothing step is separate from the original predictor, it applies to a broad class of machine learning tasks, with no need to retrain models. Our theoretical analysis details simple conditions under which post-estimation smoothing will improve accuracy over that of the original predictor. Our experiments on large scale spatial and temporal datasets highlight the speed and accuracy of post-estimation smoothing in practice. Together, these results illuminate a novel way to consider and incorporate the natural structure of index variables in machine learning.

* To appear in AISTATS 2020
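
One way to picture a smoothing step that runs after prediction and leaves the predictor untouched is a kernel-weighted average of the predictions over the index variable. The sketch below assumes Gaussian weights over time stamps with a hand-picked bandwidth; the function `smooth_predictions`, the weights, and the toy data are illustrative choices, and the paper's operator may be defined differently.

```python
import math

# Sketch only: the Gaussian weights, bandwidth, and toy data are assumptions
# for illustration; the paper's operator and weighting may differ.

def smooth_predictions(predictions, indices, bandwidth):
    """Replace each prediction with a kernel-weighted average over nearby indices."""
    smoothed = []
    for t_i in indices:
        weights = [math.exp(-((t_i - t_j) / bandwidth) ** 2 / 2.0) for t_j in indices]
        total = sum(weights)
        smoothed.append(sum(w * p for w, p in zip(weights, predictions)) / total)
    return smoothed


# Predictions from any already-trained model, indexed by time stamp.
timestamps = [0.0, 1.0, 2.0, 3.0, 4.0]
raw = [1.0, 1.2, 5.0, 1.1, 0.9]
smoothed = smooth_predictions(raw, timestamps, bandwidth=1.0)
```

Because the function only sees predictions and indices, it can be applied to the output of any model without retraining, which is the property the abstract emphasizes.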

Neural Kernels Without Tangents

Mar 05, 2020

Vaishaal Shankar, Alex Fang, Wenshuo Guo, Sara Fridovich-Keil, Ludwig Schmidt, Jonathan Ragan-Kelley, Benjamin Recht

Abstract: We investigate the connections between neural networks and simple building blocks in kernel space. In particular, using well-established feature space tools such as direct sum, averaging, and moment lifting, we present an algebra for creating "compositional" kernels from bags of features. We show that these operations correspond to many of the building blocks of "neural tangent kernels" (NTKs). Experimentally, we show that there is a correlation in test error between neural network architectures and the associated kernels. We construct a simple neural network architecture, using only 3x3 convolutions, 2x2 average pooling, and ReLU and optimized with SGD and MSE loss, that achieves 96% accuracy on CIFAR10 and whose corresponding compositional kernel achieves 90% accuracy. We also use our constructions to investigate the relative performance of neural networks, NTKs, and compositional kernels in the small dataset regime. In particular, we find that compositional kernels outperform NTKs and neural networks outperform both kernel methods.

* code used to produce our results can be found at: https://github.com/modestyachts/neural_kernels_code

Recommendations and User Agency: The Reachability of Collaboratively-Filtered Information

Dec 20, 2019

Sarah Dean, Sarah Rich, Benjamin Recht

Abstract: Recommender systems often rely on models which are trained to maximize accuracy in predicting user preferences. When the systems are deployed, these models determine the availability of content and information to different users. The gap between these objectives gives rise to a potential for unintended consequences, contributing to phenomena such as filter bubbles and polarization. In this work, we consider directly the information availability problem through the lens of user recourse. Using ideas of reachability, we propose a computationally efficient audit for top-$N$ linear recommender models. Furthermore, we describe the relationship between model complexity and the effort necessary for users to exert control over their recommendations. We use this insight to provide a novel perspective on the user cold-start problem. Finally, we demonstrate these concepts with an empirical investigation of a state-of-the-art model trained on a widely used movie ratings dataset.

* to appear at FAT* '20
