Honglak Lee

University of Michigan, Ann Arbor

Evolving Reinforcement Learning Algorithms

Jan 08, 2021

John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust

Abstract:We propose a method for meta-learning reinforcement learning algorithms by searching over the space of computational graphs which compute the loss function for a value-based model-free RL agent to optimize. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. Our method can both learn from scratch and bootstrap off known existing algorithms, like DQN, enabling interpretable modifications which improve performance. Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm. Bootstrapped from DQN, we highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games. The analysis of the learned algorithm behavior shows resemblance to recently proposed RL algorithms that address overestimation in value-based methods.
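For orientation, the objective that the search is reported to rediscover from scratch is the familiar one-step temporal-difference loss. The sketch below is the textbook squared TD error used by DQN, not one of the paper's learned losses; the function name `td_loss`, its parameters, and the default discount are illustrative assumptions.

```python
def td_loss(q_sa, reward, next_q_values, gamma=0.99, done=False):
    """Squared one-step TD error for a single transition (hypothetical helper).

    q_sa:           current estimate Q(s, a)
    reward:         reward observed after taking a in s
    next_q_values:  target-network estimates Q(s', a') for every action a'
    """
    # Bootstrapped target: r + gamma * max_a' Q(s', a'); no bootstrap at episode end.
    bootstrap = 0.0 if done else gamma * max(next_q_values)
    target = reward + bootstrap
    return (q_sa - target) ** 2
```

A loss of this shape is the kind of starting graph that the DQN-bootstrapped search would then modify.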

Few-shot Sequence Learning with Transformers

Dec 17, 2020

Lajanugen Logeswaran, Ann Lee, Myle Ott, Honglak Lee, Marc'Aurelio Ranzato, Arthur Szlam

Abstract:Few-shot algorithms aim at learning new tasks provided only a handful of training examples. In this work we investigate few-shot learning in the setting where the data points are sequences of tokens and propose an efficient learning algorithm based on Transformers. In the simplest setting, we append a token to an input sequence which represents the particular task to be undertaken, and show that the embedding of this token can be optimized on the fly given few labeled examples. Our approach does not require complicated changes to the model architecture such as adapter layers nor computing second order derivatives as is currently popular in the meta-learning and few-shot learning literature. We demonstrate our approach on a variety of tasks, and analyze the generalization properties of several model variants and baseline approaches. In particular, we show that compositional task descriptors can improve performance. Experiments show that our approach works at least as well as other methods, while being more computationally efficient.

* NeurIPS Meta-Learning Workshop 2020
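The core idea in the abstract, adapting to a new task by optimizing only a task embedding while the model stays frozen, can be illustrated with a deliberately tiny stand-in. The sketch below replaces the Transformer with a frozen linear model; `fit_task_embedding`, `frozen_w`, and `frozen_u` are hypothetical names, and plain gradient descent on squared error is an assumption rather than the paper's procedure.

```python
def fit_task_embedding(examples, frozen_w, frozen_u, steps=200, lr=0.1):
    """Fit only the task embedding e of a frozen toy model.

    Toy model (an assumption, not the paper's architecture):
        y_hat = dot(frozen_w, x) + dot(frozen_u, e)
    examples: list of (x, y) pairs, x a list of floats and y a float.
    """
    e = [0.0] * len(frozen_u)  # task embedding, the only trainable quantity
    for _ in range(steps):
        grad = [0.0] * len(e)
        for x, y in examples:
            y_hat = sum(w * xi for w, xi in zip(frozen_w, x))
            y_hat += sum(u * ei for u, ei in zip(frozen_u, e))
            err = y_hat - y
            # Gradient of the mean squared error with respect to e only.
            for k, u in enumerate(frozen_u):
                grad[k] += 2.0 * err * u / len(examples)
        e = [ei - lr * g for ei, g in zip(e, grad)]
    return e
```

Because the frozen weights are never updated, no second-order derivatives or architectural changes are involved, which mirrors the efficiency argument in the abstract.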

Text-to-Image Generation Grounded by Fine-Grained User Attention

Nov 07, 2020

Jing Yu Koh, Jason Baldridge, Honglak Lee, Yinfei Yang

Abstract:Localized Narratives is a dataset with detailed natural language descriptions of images paired with mouse traces that provide a sparse, fine-grained visual grounding for phrases. We propose TReCS, a sequential model that exploits this grounding to generate images. TReCS uses descriptions to retrieve segmentation masks and predict object labels aligned with mouse traces. These alignments are used to select and position masks to generate a fully covered segmentation canvas; the final image is produced by a segmentation-to-image generator using this canvas. This multi-step, retrieval-based approach outperforms existing direct text-to-image generation models on both automatic metrics and human evaluations: overall, its generated images are more photo-realistic and better match descriptions.

* To appear in WACV 2021

What's in a Loss Function for Image Classification?

Oct 30, 2020

Simon Kornblith, Honglak Lee, Ting Chen, Mohammad Norouzi

Abstract:It is common to use the softmax cross-entropy loss to train neural networks on classification datasets where a single class label is assigned to each example. However, it has been shown that modifying softmax cross-entropy with label smoothing or regularizers such as dropout can lead to higher performance. This paper studies a variety of loss functions and output layer regularization strategies on image classification tasks. We observe meaningful differences in model predictions, accuracy, calibration, and out-of-distribution robustness for networks trained with different objectives. However, differences in hidden representations of networks trained with different objectives are restricted to the last few layers; representational similarity reveals no differences among network layers that are not close to the output. We show that all objectives that improve over vanilla softmax loss produce greater class separation in the penultimate layer of the network, which potentially accounts for improved performance on the original task, but results in features that transfer worse to other tasks.
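As a concrete reference point for one of the objectives mentioned, label smoothing replaces the one-hot target with a mixture of the one-hot vector and a uniform distribution. The sketch below uses the common formulation with smoothing weight `alpha`; the function names and the default `alpha=0.1` are assumptions, and the paper's exact settings are not reproduced here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smoothed_cross_entropy(logits, label, alpha=0.1):
    """Cross-entropy against a label-smoothed target.

    Target for class i: alpha / K + (1 - alpha) * [i == label],
    where K is the number of classes. alpha = 0 gives the vanilla loss.
    """
    k = len(logits)
    probs = softmax(logits)
    targets = [alpha / k + (1.0 - alpha) * (1.0 if i == label else 0.0)
               for i in range(k)]
    return -sum(t * math.log(p) for t, p in zip(targets, probs))
```

Setting `alpha=0` recovers vanilla softmax cross-entropy, which makes the two objectives easy to compare on the same logits.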

Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments

Oct 28, 2020

Wilka Carvalho, Anthony Liang, Kimin Lee, Sungryull Sohn, Honglak Lee, Richard L. Lewis, Satinder Singh

Abstract:First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as the AI2Thor virtual home-environment pose significant sample-efficiency challenges for reinforcement learning (RL) agents learning from sparse task rewards. To alleviate these challenges, prior work has provided extensive supervision via a combination of reward-shaping, ground-truth object-information, and expert demonstrations. In this work, we show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task during task learning with an object-centric relational RL agent. Our key insight is that learning an object-model that incorporates object-attention into forward prediction provides a dense learning signal for unsupervised representation learning of both objects and their relationships. This, in turn, enables faster policy learning for an object-centric relational RL agent. We demonstrate our agent by introducing a set of challenging object-interaction tasks in the AI2Thor environment where learning with our attentive object-model is key to strong performance. Specifically, we compare our agent and relational RL agents with alternative auxiliary tasks to a relational RL agent equipped with ground-truth object-information, and show that learning with our object-model best closes the performance gap in terms of both learning speed and maximum success rate. Additionally, we find that incorporating object-attention into an object-model's forward predictions is key to learning representations which capture object-category and object-state.

Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning

Oct 23, 2020

Guangxiang Zhu, Minghao Zhang, Honglak Lee, Chongjie Zhang

Abstract:Sample efficiency has been one of the major challenges for deep reinforcement learning. Recently, model-based reinforcement learning has been proposed to address this challenge by performing planning on imaginary trajectories with a learned world model. However, world model learning may suffer from overfitting to training trajectories, and thus model-based value estimation and policy search will be prone to getting stuck in an inferior local policy. In this paper, we propose a novel model-based reinforcement learning algorithm, called BrIdging Reality and Dream (BIRD). It maximizes the mutual information between imaginary and real trajectories so that the policy improvement learned from imaginary trajectories can be easily generalized to real trajectories. We demonstrate that our approach improves sample efficiency of model-based planning, and achieves state-of-the-art performance on challenging visual control benchmarks.

* Published at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

i-Mix: A Strategy for Regularizing Contrastive Representation Learning

Oct 17, 2020

Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, Honglak Lee

Abstract:Contrastive representation learning has been shown to be an effective way of learning representations from unlabeled data. However, much progress has been made in vision domains relying on data augmentations carefully designed using domain knowledge. In this work, we propose i-Mix, a simple yet effective regularization strategy for improving contrastive representation learning in both vision and non-vision domains. We cast contrastive learning as training a non-parametric classifier by assigning a unique virtual class to each data instance in a batch. Then, data instances are mixed in both the input and virtual label spaces, providing more augmented data during training. In experiments, we demonstrate that i-Mix consistently improves the quality of self-supervised representations across domains, resulting in significant performance gains on downstream tasks. Furthermore, we confirm its regularization effect via extensive ablation studies across model and dataset sizes.
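The mixing step described above can be sketched on plain feature vectors. Each instance gets a one-hot virtual label indexed by its position in the batch, and both inputs and labels are interpolated with a shuffled copy of the batch. The name `i_mix_batch`, the single batch-wide coefficient, and the MixUp-style Beta(alpha, alpha) sampling are assumptions for illustration, not details taken from the abstract.

```python
import random

def i_mix_batch(batch, alpha=1.0, rng=None):
    """Mix a batch in input space and in virtual-label space.

    batch: list of equal-length feature vectors (lists of floats).
    Returns (mixed_inputs, mixed_labels, lam), where mixed_labels[i] is a
    soft label over the n virtual classes (one per instance in the batch).
    """
    if rng is None:
        rng = random.Random()
    n = len(batch)
    lam = rng.betavariate(alpha, alpha)  # assumed MixUp-style coefficient
    perm = list(range(n))
    rng.shuffle(perm)                    # partner index for every instance
    mixed_inputs, mixed_labels = [], []
    for i, j in enumerate(perm):
        x = [lam * a + (1.0 - lam) * b for a, b in zip(batch[i], batch[j])]
        y = [0.0] * n
        y[i] += lam                      # weight on the instance's own class
        y[j] += 1.0 - lam                # weight on the partner's class
        mixed_inputs.append(x)
        mixed_labels.append(y)
    return mixed_inputs, mixed_labels, lam
```

The soft labels would then serve as targets for the non-parametric classifier that the abstract describes; that training loop is omitted here.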

Text as Neural Operator: Image Manipulation by Text Instruction

Aug 12, 2020

Tianhao Zhang, Hung-Yu Tseng, Lu Jiang, Honglak Lee, Irfan Essa, Weilong Yang

Abstract:In this paper, we study a new task that allows users to edit an input image using language instructions. In this image generation task, the inputs are a reference image and a text instruction that describes desired modifications to the input image. We propose a GAN-based method to tackle this problem. The key idea is to treat language as neural operators to locally modify the image feature. To this end, our model decomposes the generation process into finding where (spatial region) and how (text operators) to apply modifications. We show that the proposed model performs favorably against recent baselines on three datasets.

Predictive Information Accelerates Learning in RL

Jul 24, 2020

Kuang-Huei Lee, Ian Fischer, Anthony Liu, Yijie Guo, Honglak Lee, John Canny, Sergio Guadarrama

Abstract:The Predictive Information is the mutual information between the past and the future, I(X_past; X_future). We hypothesize that capturing the predictive information is useful in RL, since the ability to model what will happen next is necessary for success on many tasks. To test our hypothesis, we train Soft Actor-Critic (SAC) agents from pixels with an auxiliary task that learns a compressed representation of the predictive information of the RL environment dynamics using a contrastive version of the Conditional Entropy Bottleneck (CEB) objective. We refer to these as Predictive Information SAC (PI-SAC) agents. We show that PI-SAC agents can substantially improve sample efficiency over challenging baselines on tasks from the DM Control suite of continuous control environments. We evaluate PI-SAC agents by comparing against uncompressed PI-SAC agents, other compressed and uncompressed agents, and SAC agents directly trained from pixels.
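The quantity I(X_past; X_future) is an ordinary mutual information, which can be computed exactly when past and future are small discrete variables. The sketch below evaluates the definition on a joint probability table; it only illustrates the quantity itself and is not the contrastive CEB estimator that the abstract applies to pixel observations. The function name `mutual_information` is an assumption.

```python
import math

def mutual_information(joint):
    """I(X; Y) in nats for a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]        # marginal over rows (the "past")
    py = [sum(col) for col in zip(*joint)]  # marginal over columns (the "future")
    mi = 0.0
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0.0:                     # zero-probability cells contribute nothing
                mi += p * math.log(p / (px[x] * py[y]))
    return mi
```

An independent past and future give zero, while a future fully determined by a uniformly distributed binary past gives log 2 nats.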

Understanding and Diagnosing Vulnerability under Adversarial Attacks

Jul 17, 2020

Haizhong Zheng, Ziqi Zhang, Honglak Lee, Atul Prakash

Abstract:Deep Neural Networks (DNNs) are known to be vulnerable to adversarial attacks. Currently, there is no clear insight into how slight perturbations cause such a large difference in classification results and how we can design a more robust model architecture. In this work, we propose a novel interpretability method, InterpretGAN, to generate explanations for features used for classification in latent variables. Interpreting the classification process of adversarial examples exposes how adversarial perturbations influence features layer by layer as well as which features are modified by perturbations. Moreover, we design the first diagnostic method to quantify the vulnerability contributed by each layer, which can be used to identify vulnerable parts of model architectures. The diagnostic results show that the layers introducing more information loss tend to be more vulnerable than other layers. Based on the findings, our evaluation results on MNIST and CIFAR10 datasets suggest that average pooling layers, with lower information loss, are more robust than max pooling layers for the network architectures studied in this paper.
