
Vlad Sobal


A Cookbook of Self-Supervised Learning

Apr 24, 2023
Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, Micah Goldblum

Figures 1-4 for A Cookbook of Self-Supervised Learning

Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning. Yet, much like cooking, training SSL methods is a delicate art with a high barrier to entry. While many components are familiar, successfully training an SSL method involves a dizzying set of choices, from the pretext tasks to the training hyper-parameters. Our goal is to lower the barrier to entry into SSL research by laying out the foundations and the latest SSL recipes in the style of a cookbook. We hope to empower the curious researcher to navigate the terrain of methods, understand the role of the various knobs, and gain the know-how required to explore how delicious SSL can be.
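As a rough illustration of the kind of pretext objective such a cookbook covers, here is a minimal NumPy sketch of the SimCLR-style NT-Xent contrastive loss; the temperature and toy embedding shapes are illustrative assumptions, not recipes from the paper.

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy), the
    contrastive pretext objective popularized by SimCLR. z1 and z2 are
    embeddings of two augmented views of the same batch of inputs."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize
    sim = z @ z.T / tau                                # scaled cosine sims
    sim[np.eye(2 * n, dtype=bool)] = -np.inf           # drop self-similarity
    # view i is the positive for view i+n (and vice versa)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # cross-entropy of the positive pair against all other pairs
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return (-sim[np.arange(2 * n), pos] + logsumexp).mean()
```

The loss pulls the two views of each example together while pushing apart all other pairs in the batch; the temperature `tau` is one of the "knobs" the abstract alludes to.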


Joint Embedding Predictive Architectures Focus on Slow Features

Nov 20, 2022
Vlad Sobal, Jyothir S V, Siddhartha Jalagam, Nicolas Carion, Kyunghyun Cho, Yann LeCun

Figures 1-4 for Joint Embedding Predictive Architectures Focus on Slow Features

Many common methods for learning a world model for pixel-based environments use generative architectures trained with pixel-level reconstruction objectives. Recently proposed Joint Embedding Predictive Architectures (JEPA) offer a reconstruction-free alternative. In this work, we analyze the performance of JEPAs trained with the VICReg and SimCLR objectives in the fully offline setting, without access to rewards, and compare the results to the performance of a generative architecture. We test the methods in a simple environment containing a moving dot with various background distractors, and probe the learned representations for the dot's location. We find that JEPA methods perform on par with or better than reconstruction when the distractor noise changes at every time step, but fail when the noise is fixed. Furthermore, we provide a theoretical explanation for the poor performance of JEPA-based methods with fixed noise, highlighting an important limitation.

* 4 pages (3 figures) short paper for SSL Theory and Practice workshop at NeurIPS 2022. Code is available at https://github.com/vladisai/JEPA_SSL_NeurIPS_2022 
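As a sketch of one of the objectives named in the abstract, the NumPy snippet below implements the three terms of a VICReg-style loss (invariance, variance, covariance). The coefficients are the common defaults from the original VICReg paper and are illustrative here, not necessarily the settings used in this work.

```python
import numpy as np

def vicreg_loss(z1, z2, lam=25.0, mu=25.0, nu=1.0, eps=1e-4):
    """VICReg-style objective: invariance (MSE between the two batches of
    embeddings), variance (hinge keeping each dimension's std above 1),
    and covariance (penalizing off-diagonal covariance entries)."""
    n, d = z1.shape
    inv = ((z1 - z2) ** 2).mean()                    # invariance term

    def var_term(z):
        std = np.sqrt(z.var(axis=0) + eps)
        return np.maximum(0.0, 1.0 - std).mean()     # variance hinge

    def cov_term(z):
        zc = z - z.mean(axis=0)
        cov = (zc.T @ zc) / (n - 1)
        off = cov - np.diag(np.diag(cov))
        return (off ** 2).sum() / d                  # covariance penalty

    return (lam * inv
            + mu * (var_term(z1) + var_term(z2))
            + nu * (cov_term(z1) + cov_term(z2)))
```

The variance and covariance terms prevent the representation from collapsing, which is what lets a JEPA be trained without a pixel-reconstruction objective.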

Light-weight probing of unsupervised representations for Reinforcement Learning

Aug 25, 2022
Wancong Zhang, Anthony GX-Chen, Vlad Sobal, Yann LeCun, Nicolas Carion

Figures 1-4 for Light-weight probing of unsupervised representations for Reinforcement Learning

Unsupervised visual representation learning offers the opportunity to leverage large corpora of unlabeled trajectories to form useful visual representations, which can benefit the training of reinforcement learning (RL) algorithms. However, evaluating the fitness of such representations requires training RL algorithms, which is computationally intensive and yields high-variance outcomes. To alleviate this issue, we design an evaluation protocol for unsupervised RL representations with lower variance and up to 600x lower computational cost. Inspired by the vision community, we propose two linear probing tasks: predicting the reward observed in a given state, and predicting the action of an expert in a given state. These two tasks are generally applicable to many RL domains, and we show through rigorous experimentation that they correlate strongly with the actual downstream control performance on the Atari100k Benchmark. This provides a better method for exploring the space of pretraining algorithms without running RL evaluations for every setting. Leveraging this framework, we further improve existing self-supervised learning (SSL) recipes for RL, highlighting the importance of the forward model, the size of the visual backbone, and the precise formulation of the unsupervised objective.
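The probing protocol described in the abstract amounts to fitting a linear head on frozen features and scoring it on held-out data. The sketch below uses a closed-form ridge-regression probe for reward prediction; the synthetic train/test split and `l2` strength are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def linear_probe_score(features, targets, l2=1.0):
    """Light-weight probing sketch: freeze the pretrained representation,
    fit only a linear head on top of it, and report held-out MSE as a
    cheap proxy for downstream control performance."""
    n, d = features.shape
    split = n // 2                                   # toy 50/50 split
    X_tr, X_te = features[:split], features[split:]
    y_tr, y_te = targets[:split], targets[split:]
    # closed-form ridge regression: w = (X^T X + l2 I)^-1 X^T y
    w = np.linalg.solve(X_tr.T @ X_tr + l2 * np.eye(d), X_tr.T @ y_tr)
    return ((X_te @ w - y_te) ** 2).mean()           # held-out probe error
```

Because only the linear head is trained, evaluating a representation costs one small regression instead of a full RL run, which is where the variance and compute savings come from.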


Separating the World and Ego Models for Self-Driving

Apr 14, 2022
Vlad Sobal, Alfredo Canziani, Nicolas Carion, Kyunghyun Cho, Yann LeCun

Figures 1-4 for Separating the World and Ego Models for Self-Driving

Training self-driving systems to be robust to the long tail of driving scenarios is a critical problem. Model-based approaches leverage simulation to emulate a wide range of scenarios without putting users at risk in the real world. One promising path to faithful simulation is to train a forward model of the world to predict the future states of both the environment and the ego-vehicle given past states and a sequence of actions. In this paper, we argue that it is beneficial to model the state of the ego-vehicle, which often has simple, predictable, and deterministic behavior, separately from the rest of the environment, which is much more complex and highly multimodal. We propose to model the ego-vehicle using a simple and differentiable kinematic model, while training a stochastic convolutional forward model on raster representations of the state to predict the behavior of the rest of the environment. We explore several configurations of such decoupled models, and evaluate their performance both with Model Predictive Control (MPC) and direct policy learning. We test our methods on the task of highway driving and demonstrate lower crash rates and better stability. The code is available at https://github.com/vladisai/pytorch-PPUU/tree/ICLR2022.

* 8 pages main content, 14 with references and appendix. 5 figures in total. Submitted and accepted to ICLR 2022 workshop on Generalizable Policy Learning in the Physical World (https://ai-workshops.github.io/generalizable-policy-learning-in-the-physical-world/) 
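The differentiable kinematic ego-model described in the abstract can be sketched as a standard kinematic bicycle model; the state layout, `dt`, and `wheelbase` values below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def kinematic_step(state, action, dt=0.1, wheelbase=2.7):
    """One step of a kinematic bicycle model for the ego-vehicle.
    state  = (x, y, heading, speed)
    action = (acceleration, steering angle)
    Every operation is smooth, so the model is differentiable and can
    sit inside an MPC or policy-learning loop alongside a learned
    stochastic world model for the rest of the environment."""
    x, y, theta, v = state
    a, delta = action
    x = x + v * np.cos(theta) * dt                 # advance position
    y = y + v * np.sin(theta) * dt
    theta = theta + v / wheelbase * np.tan(delta) * dt  # update heading
    v = v + a * dt                                 # update speed
    return np.array([x, y, theta, v])
```

Keeping the ego dynamics in this closed form, rather than learning them, is the decoupling the paper argues for: the simple deterministic part stays exact while the multimodal environment is handled by the learned model.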