Honglak Lee

University of Michigan, Ann Arbor

Improved Consistency Regularization for GANs

Feb 11, 2020

Zhengli Zhao, Sameer Singh, Honglak Lee, Zizhao Zhang, Augustus Odena, Han Zhang

Abstract: Recent work has increased the performance of Generative Adversarial Networks (GANs) by enforcing a consistency cost on the discriminator. We improve on this technique in several ways. We first show that consistency regularization can introduce artifacts into the GAN samples and explain how to fix this issue. We then propose several modifications to the consistency regularization procedure designed to improve its performance. We carry out extensive experiments quantifying the benefit of our improvements. For unconditional image synthesis on CIFAR-10 and CelebA, our modifications yield the best known FID scores on various GAN architectures. For conditional image synthesis on CIFAR-10, we improve the state-of-the-art FID score from 11.48 to 9.21. Finally, on ImageNet-2012, we apply our technique to the original BigGAN model and improve the FID from 6.66 to 5.38, which is the best score at that model size.

* Augustus Odena and Han Zhang contributed equally

BRPO: Batch Residual Policy Optimization

Feb 08, 2020

Sungryull Sohn, Yinlam Chow, Jayden Ooi, Ofir Nachum, Honglak Lee, Ed Chi, Craig Boutilier

Abstract: In batch reinforcement learning (RL), one often constrains a learned policy to be close to the behavior (data-generating) policy, e.g., by constraining the learned action distribution to differ from the behavior policy by some maximum degree that is the same at each state. This can cause batch RL to be overly conservative, unable to exploit large policy changes at frequently-visited, high-confidence states without risking poor performance at sparsely-visited states. To remedy this, we propose residual policies, where the allowable deviation of the learned policy is state-action-dependent. We derive a new RL method, BRPO, which learns both the policy and allowable deviation that jointly maximize a lower bound on policy performance. We show that BRPO achieves state-of-the-art performance in a number of tasks.

High-Fidelity Synthesis with Disentangled Representation

Jan 13, 2020

Wonkwang Lee, Donggyun Kim, Seunghoon Hong, Honglak Lee

Abstract: Learning disentangled representation of data without supervision is an important step towards improving the interpretability of generative models. Despite recent advances in disentangled representation learning, existing approaches often suffer from the trade-off between representation learning and generation performance (i.e., improving generation quality sacrifices disentanglement performance). We propose an Information-Distillation Generative Adversarial Network (ID-GAN), a simple yet generic framework that easily incorporates the existing state-of-the-art models for both disentanglement learning and high-fidelity synthesis. Our method learns disentangled representation using VAE-based models, and distills the learned representation with an additional nuisance variable to the separate GAN-based generator for high-fidelity synthesis. To ensure that both generative models are aligned to render the same generative factors, we further constrain the GAN generator to maximize the mutual information between the learned latent code and the output. Despite its simplicity, we show that the proposed method is highly effective, achieving comparable image generation quality to the state-of-the-art methods using the disentangled representation. We also show that the proposed decomposition leads to an efficient and stable model design, and we demonstrate photo-realistic high-resolution image synthesis results (1024x1024 pixels) for the first time using the disentangled representations.

Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies

Jan 01, 2020

Sungryull Sohn, Hyunjae Woo, Jongwook Choi, Honglak Lee

Abstract: We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph which describes a set of subtasks and their dependencies that are unknown to the agent. The agent needs to quickly adapt to the task over a few episodes during the adaptation phase to maximize the return in the test phase. Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference (MSGI), which infers the latent parameter of the task by interacting with the environment and maximizes the return given the latent parameter. To facilitate learning, we adopt an intrinsic reward inspired by upper confidence bound (UCB) that encourages efficient exploration. Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter, and to adapt more efficiently than existing meta RL and hierarchical RL methods.

* In ICLR 2020

Efficient Adversarial Training with Transferable Adversarial Examples

Dec 27, 2019

Haizhong Zheng, Ziqi Zhang, Juncheng Gu, Honglak Lee, Atul Prakash

Abstract: Adversarial training is an effective defense method to protect classification models against adversarial attacks. However, one limitation of this approach is that it can require orders of magnitude additional training time due to the high cost of generating strong adversarial examples during training. In this paper, we first show that there is high transferability between models from neighboring epochs in the same training process, i.e., adversarial examples from one epoch continue to be adversarial in subsequent epochs. Leveraging this property, we propose a novel method, Adversarial Training with Transferable Adversarial Examples (ATTA), that can enhance the robustness of trained models and greatly improve the training efficiency by accumulating adversarial perturbations through epochs. Compared to state-of-the-art adversarial training methods, ATTA enhances adversarial accuracy by up to 7.2% on CIFAR10 and requires 12~14x less training time on MNIST and CIFAR10 datasets with comparable model robustness.
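
The idea of accumulating adversarial perturbations through epochs can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a linear model with squared error, a single signed-gradient step per epoch, a clipping bound `eps`, and a small weight drift that stands in for a real training update. The sketch contrasts restarting the attack from zero each epoch with reusing the perturbation carried over from the previous epoch.

```python
def predict(w, x):
    """Output of a toy linear model."""
    return sum(wi * xi for wi, xi in zip(w, x))

def loss(w, x, y):
    """Squared error of the toy model on one example."""
    return (predict(w, x) - y) ** 2

def input_gradient(w, x, y):
    """Gradient of the squared error with respect to the input x."""
    residual = predict(w, x) - y
    return [2.0 * residual * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def perturb(x, delta):
    return [xi + di for xi, di in zip(x, delta)]

def attack_step(w, x, y, delta, step, eps):
    """One signed-gradient ascent step from `delta`, clipped to [-eps, eps]."""
    grad = input_gradient(w, perturb(x, delta), y)
    return [max(-eps, min(eps, di + step * sign(gi)))
            for di, gi in zip(delta, grad)]

x, y = [1.0, -2.0, 0.5], 0.0       # hypothetical example and target
eps, step = 0.3, 0.1               # hypothetical attack budget and step size
w = [0.4, 0.3, -0.2]               # hypothetical model weights

accumulated = [0.0] * len(x)       # perturbation carried across epochs
for epoch in range(3):
    # Baseline: restart the attack from zero every epoch.
    fresh = attack_step(w, x, y, [0.0] * len(x), step, eps)
    # Accumulating variant: continue from last epoch's perturbation.
    accumulated = attack_step(w, x, y, accumulated, step, eps)
    # Stand-in for a training update: the model changes only slightly.
    w = [wi * 1.05 for wi in w]

loss_fresh = loss(w, perturb(x, fresh), y)
loss_accumulated = loss(w, perturb(x, accumulated), y)
```

In this toy setting the carried-over perturbation reaches a higher loss than the restarted one for the same per-epoch cost of one gradient step, which is the intuition behind reusing perturbations when consecutive models are similar.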

How Should an Agent Practice?

Dec 15, 2019

Janarthanan Rajendran, Richard Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh

Abstract: We present a method for learning intrinsic reward functions to drive the learning of an agent during periods of practice in which extrinsic task rewards are not available. During practice, the environment may differ from the one available for training and evaluation with extrinsic rewards. We refer to this setup of alternating periods of practice and objective evaluation as practice-match, drawing an analogy to regimes of skill acquisition common for humans in sports and games. The agent must effectively use periods in the practice environment so that performance improves during matches. In the proposed method, the intrinsic practice reward is learned through a meta-gradient approach that adapts the practice reward parameters to reduce the extrinsic match reward loss computed from matches. We illustrate the method on a simple grid world, and evaluate it in two games in which the practice environment differs from the match environment: Pong with practice against a wall without an opponent, and PacMan with practice in a maze without ghosts. The results show gains from learning in practice in addition to match periods over learning in matches only.

* AAAI-2020

High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks

Nov 05, 2019

Ruben Villegas, Arkanath Pathak, Harini Kannan, Dumitru Erhan, Quoc V. Le, Honglak Lee

Abstract: Predicting future video frames is extremely challenging, as there are many factors of variation that make up the dynamics of how frames change through time. Previously proposed solutions require complex inductive biases inside network architectures with highly specialized computation, including segmentation masks, optical flow, and foreground and background separation. In this work, we question if such handcrafted architectures are necessary and instead propose a different approach: finding minimal inductive bias for video prediction while maximizing network capacity. We investigate this question by performing the first large-scale empirical study and demonstrate state-of-the-art performance by learning large models on three different datasets: one for modeling object interactions, one for modeling human motion, and one for modeling car driving.

* In Advances in Neural Information Processing Systems (NeurIPS), 2019

Consistency Regularization for Generative Adversarial Networks

Oct 26, 2019

Han Zhang, Zizhao Zhang, Augustus Odena, Honglak Lee

Abstract: Generative Adversarial Networks (GANs) are known to be difficult to train, despite considerable research effort. Several regularization techniques for stabilizing training have been proposed, but they introduce non-trivial computational overheads and interact poorly with existing techniques like spectral normalization. In this work, we propose a simple, effective training stabilizer based on the notion of consistency regularization, a popular technique in the semi-supervised learning literature. In particular, we augment data passing into the GAN discriminator and penalize the sensitivity of the discriminator to these augmentations. We conduct a series of experiments to demonstrate that consistency regularization works effectively with spectral normalization and various GAN architectures, loss functions and optimizer settings. Our method achieves the best FID scores for unconditional image generation compared to other regularization methods on CIFAR-10 and CelebA. Moreover, our consistency regularized GAN (CR-GAN) improves state-of-the-art FID scores for conditional generation from 14.73 to 11.67 on CIFAR-10 and from 8.73 to 6.66 on ImageNet-2012.
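
As a rough illustration of the idea described in the abstract (augment the data passed to the discriminator and penalize the discriminator's sensitivity to the augmentation), the sketch below computes a consistency penalty for a toy linear discriminator. The flip augmentation, the squared-difference form of the penalty, and all names are assumptions made for illustration, not the authors' implementation.

```python
import random

def discriminator(x, w):
    """Toy linear discriminator: a weighted sum of the input features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def augment(x):
    """Hypothetical augmentation: flip a 1-D 'image' left to right."""
    return list(reversed(x))

def consistency_penalty(batch, w):
    """Mean squared difference between D(x) and D(T(x)) over a batch."""
    total = 0.0
    for x in batch:
        diff = discriminator(x, w) - discriminator(augment(x), w)
        total += diff * diff
    return total / len(batch)

random.seed(0)
batch = [[random.random() for _ in range(4)] for _ in range(8)]

symmetric_w = [0.5, -1.0, -1.0, 0.5]    # output unchanged by the flip
asymmetric_w = [1.0, 0.0, 0.0, -1.0]    # output changes sign under the flip

penalty_symmetric = consistency_penalty(batch, symmetric_w)
penalty_asymmetric = consistency_penalty(batch, asymmetric_w)
```

A discriminator whose output is invariant to the augmentation incurs (numerically) no penalty, while one that reacts to the flip is penalized. In training, a term like this would be added to the discriminator loss with some weighting coefficient, which is likewise an assumption here.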

IEG: Robust Neural Network Training to Tackle Severe Label Noise

Oct 13, 2019

Zizhao Zhang, Han Zhang, Sercan O. Arik, Honglak Lee, Tomas Pfister

Abstract: Collecting large-scale data with clean labels for supervised training of neural networks is practically challenging. Although noisy labels are usually cheap to acquire, existing methods suffer severely for training datasets with high noise ratios, making high-cost human labeling a necessity. Here we present a method to train neural networks in a way that is almost invulnerable to severe label noise by utilizing a tiny trusted set. Our method, named IEG, is based on three key insights: (i) Isolation of noisy labels, (ii) Escalation of useful supervision from mislabeled data, and (iii) Guidance from small trusted data. On CIFAR100 with a 40% uniform noise ratio and 10 trusted labeled data per class, our method achieves $80.2{\pm}0.3\%$ classification accuracy, only 1.4% higher error than a neural network trained without label noise. Moreover, increasing the noise ratio to 80%, our method still achieves a high accuracy of $75.5{\pm}0.2\%$, compared to the previous best 47.7%. Finally, our method sets a new state of the art on various challenging label corruption types and levels and on large-scale WebVision benchmarks.

* v1: first committed preprint, v2: remove small typos in text and figures

A Simple Randomization Technique for Generalization in Deep Reinforcement Learning

Oct 11, 2019

Kimin Lee, Kibok Lee, Jinwoo Shin, Honglak Lee

Abstract: Deep reinforcement learning (RL) agents often fail to generalize to unseen (yet semantically similar) environments, particularly when they are trained on high-dimensional state spaces, such as images. In this paper, we propose a simple technique to improve the generalization ability of deep RL agents by introducing a randomized (convolutional) neural network that randomly perturbs input observations. It enables trained agents to adapt to new domains by learning robust features invariant across varied and randomized environments. Furthermore, we consider an inference method based on the Monte Carlo approximation to reduce the variance induced by this randomization. We demonstrate the superiority of our method across 2D CoinRun, 3D DeepMind Lab exploration and 3D robotics control tasks: it significantly outperforms various regularization and data augmentation methods for the same purpose.

* In NeurIPS Workshop on Deep RL, 2019 / First two authors contributed equally
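
The two ingredients named in the abstract, a randomized convolution that perturbs input observations and Monte Carlo averaging at inference time, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the paper's implementation: observations are 1-D lists, the kernel is sampled from a standard Gaussian with an odd size, and `policy_score` is a hypothetical stand-in for an agent's output.

```python
import random

def random_conv(obs, rng, size=3):
    """Perturb a 1-D observation with a freshly sampled random kernel.

    Uses zero padding so the output has the same length as the input.
    `size` is assumed to be odd.
    """
    kernel = [rng.gauss(0.0, 1.0) for _ in range(size)]
    pad = size // 2
    padded = [0.0] * pad + list(obs) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(obs))]

def policy_score(obs):
    """Hypothetical stand-in for an agent's output given an observation."""
    return sum(obs) / len(obs)

def monte_carlo_score(obs, rng, samples):
    """Average the output over several independently randomized inputs."""
    total = sum(policy_score(random_conv(obs, rng)) for _ in range(samples))
    return total / samples

obs = [0.2, 0.5, 0.1, 0.9, 0.4]
perturbed = random_conv(obs, random.Random(0))
mc_score = monte_carlo_score(obs, random.Random(1), samples=10)
```

Each call to `random_conv` draws a new kernel, so the same observation is seen under many random perturbations; averaging over several draws is one simple way to smooth out the resulting variance at inference time.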
