Ying Nian Wu

HALMA: Humanlike Abstraction Learning Meets Affordance in Rapid Problem Solving

Feb 22, 2021
Sirui Xie, Xiaojian Ma, Peiyu Yu, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

Humans learn compositional and causal abstraction, i.e., knowledge, in response to the structure of naturalistic tasks. When presented with a problem-solving task involving some objects, toddlers would first interact with these objects to reckon what they are and what can be done with them. Leveraging these concepts, they could understand the internal structure of this task, without seeing all of the problem instances. Remarkably, they further build cognitively executable strategies to rapidly solve novel problems. To empower a learning agent with similar capability, we argue that there should be three levels of generalization in how an agent represents its knowledge: perceptual, conceptual, and algorithmic. In this paper, we devise the very first systematic benchmark that offers joint evaluation covering all three levels. This benchmark is centered around a novel task domain, HALMA, for visual concept development and rapid problem-solving. Uniquely, HALMA has a minimal yet complete concept space, upon which we introduce a novel paradigm to rigorously diagnose and dissect learning agents' capability in understanding and generalizing complex and structural concepts. We conduct extensive experiments on reinforcement learning agents with various inductive biases and carefully report their proficiencies and weaknesses.

Generative VoxelNet: Learning Energy-Based Models for 3D Shape Synthesis and Analysis

Dec 25, 2020
Jianwen Xie, Zilong Zheng, Ruiqi Gao, Wenguan Wang, Song-Chun Zhu, Ying Nian Wu

3D data that contains rich geometry information of objects and scenes is valuable for understanding the 3D physical world. With the recent emergence of large-scale 3D datasets, it becomes increasingly crucial to have a powerful 3D generative model for 3D shape synthesis and analysis. This paper proposes a deep 3D energy-based model to represent volumetric shapes. The maximum likelihood training of the model follows an "analysis by synthesis" scheme. The benefits of the proposed model are six-fold: first, unlike GANs and VAEs, the model training does not rely on any auxiliary models; second, the model can synthesize realistic 3D shapes by Markov chain Monte Carlo (MCMC); third, the conditional model can be applied to 3D object recovery and super-resolution; fourth, the model can serve as a building block in a multi-grid modeling and sampling framework for high-resolution 3D shape synthesis; fifth, the model can be used to train a 3D generator via MCMC teaching; sixth, the model, trained without supervision, provides a powerful feature extractor for 3D data, which is useful for 3D object classification. Experiments demonstrate that the proposed model can generate high-quality 3D shape patterns and can be useful for a wide variety of 3D shape analysis tasks.

* 16 pages. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2020. arXiv admin note: substantial text overlap with arXiv:1804.00586
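
The MCMC synthesis step mentioned in the abstract above can be illustrated with a minimal Langevin dynamics sketch. This is not the paper's implementation: the learned ConvNet energy is replaced by a toy quadratic energy over a flattened 4x4x4 voxel grid, and the names `langevin_sample`, `grad_energy`, and `target` are hypothetical.

```python
import random

def langevin_sample(grad_energy, x0, step, n_steps, rng):
    """Unadjusted Langevin dynamics:
    x <- x - (step^2 / 2) * grad U(x) + step * standard normal noise."""
    x = list(x0)
    for _ in range(n_steps):
        g = grad_energy(x)
        x = [xi - 0.5 * step ** 2 * gi + step * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
    return x

# Toy stand-in for a learned energy (an assumption, not the paper's model):
# U(x) = 0.5 * ||x - target||^2, so grad U(x) = x - target.
N_VOXELS = 4 * 4 * 4
target = [1.0] * N_VOXELS

def grad_energy(x):
    return [xi - ti for xi, ti in zip(x, target)]

rng = random.Random(0)
sample = langevin_sample(grad_energy, [0.0] * N_VOXELS,
                         step=0.3, n_steps=300, rng=rng)
mean_occupancy = sum(sample) / N_VOXELS  # should hover near the target value 1.0
```

With a real model, `grad_energy` would be obtained by back-propagating through the energy network rather than written in closed form.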

Learning Energy-Based Models by Diffusion Recovery Likelihood

Dec 15, 2020
Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P. Kingma

While energy-based models (EBMs) exhibit a number of desirable properties, training and sampling on high-dimensional datasets remains challenging. Inspired by recent progress on diffusion probabilistic models, we present a diffusion recovery likelihood method to tractably learn and sample from a sequence of EBMs trained on increasingly noisy versions of a dataset. Each EBM is trained by maximizing the recovery likelihood: the conditional probability of the data at a certain noise level given their noisy versions at a higher noise level. The recovery likelihood objective is more tractable than the marginal likelihood objective, since it only requires MCMC sampling from a relatively concentrated conditional distribution. Moreover, we show that this estimation method is theoretically consistent: it learns the correct conditional and marginal distributions at each noise level, given sufficient data. After training, synthesized images can be generated efficiently by a sampling process that initializes from a spherical Gaussian distribution and progressively samples the conditional distributions at decreasing noise levels. Our method generates high-fidelity samples on various image datasets. On unconditional CIFAR-10 our method achieves FID 9.60 and inception score 8.58, superior to the majority of GANs. Moreover, we demonstrate that unlike previous work on EBMs, our long-run MCMC samples from the conditional distributions do not diverge and still represent realistic images, allowing us to accurately estimate the normalized density of data even for high-dimensional datasets.
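
To make the "relatively concentrated conditional distribution" concrete, here is a short worked form of the recovery likelihood under additive Gaussian noise. The notation (f_θ for the negative energy, σ for the noise scale) is assumed for illustration and may differ from the paper's.

```latex
% Assumed notation: f_\theta is the negative energy, \sigma the noise scale.
p_\theta(x) = \frac{1}{Z_\theta} \exp\big(f_\theta(x)\big), \qquad
\tilde{x} = x + \sigma \epsilon, \quad \epsilon \sim \mathcal{N}(0, I)

% Bayes' rule: p_\theta(x \mid \tilde{x}) \propto p_\theta(x)\, p(\tilde{x} \mid x)
p_\theta(x \mid \tilde{x}) = \frac{1}{\tilde{Z}_\theta(\tilde{x})}
  \exp\Big( f_\theta(x) - \frac{1}{2\sigma^2} \lVert \tilde{x} - x \rVert^2 \Big)
```

The quadratic term pulls x toward the noisy observation, so for small σ the conditional is close to unimodal even when the marginal is highly multi-modal, which is why MCMC on it is easier.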

Semi-supervised Learning by Latent Space Energy-Based Model of Symbol-Vector Coupling

Oct 19, 2020
Bo Pang, Erik Nijkamp, Jiali Cui, Tian Han, Ying Nian Wu

This paper proposes a latent space energy-based prior model for semi-supervised learning. The model stands on a generator network that maps a latent vector to the observed example. The energy term of the prior model couples the latent vector and a symbolic one-hot vector, so that classification can be based on the latent vector inferred from the observed example. In our learning method, the symbol-vector coupling, the generator network and the inference network are learned jointly. Our method is applicable to semi-supervised learning in various data domains such as image, text, and tabular data. Our experiments demonstrate that our method performs well on semi-supervised learning tasks.

* work in progress
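
One plausible reading of the symbol-vector coupling described above is an energy term of the form ⟨y, F(z)⟩, where y is the one-hot label and F(z) produces one score per class. Under that assumption (the function F, the scores, and all names below are hypothetical), classification from the latent vector reduces to a softmax:

```python
import math

def logsumexp(scores):
    """Numerically stable log(sum(exp(s))) over a list of scores."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def class_posterior(scores):
    """p(y = k | z) when the coupling energy is <y, F(z)> with y one-hot;
    `scores` plays the role of F(z)."""
    lse = logsumexp(scores)
    return [math.exp(s - lse) for s in scores]

# Hypothetical scores F(z) for one inferred latent vector and K = 3 classes.
scores = [2.0, 0.5, -1.0]
probs = class_posterior(scores)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Summing the coupling term over all one-hot labels gives `logsumexp(scores)`, which would serve as the energy of the latent vector alone for unlabeled examples.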

Learning Latent Space Energy-Based Prior Model for Molecule Generation

Oct 19, 2020
Bo Pang, Tian Han, Ying Nian Wu

Deep generative models have recently been applied to molecule design. If the molecules are encoded in linear SMILES strings, modeling becomes convenient. However, models relying on string representations tend to generate invalid samples and duplicates. Prior work addressed these issues by building models on chemically-valid fragments or explicitly enforcing chemical rules in the generation process. We argue that an expressive model is sufficient to implicitly and automatically learn the complicated chemical rules from the data, even if molecules are encoded in simple character-level SMILES strings. We propose to learn a latent space energy-based prior model with SMILES representation for molecule modeling. Our experiments show that our method is able to generate molecules with validity and uniqueness competitive with state-of-the-art models. Interestingly, generated molecules have structural and chemical features whose distributions almost perfectly match those of the real molecules.

A Representational Model of Grid Cells Based on Matrix Lie Algebras

Jun 18, 2020
Ruiqi Gao, Jianwen Xie, Song-Chun Zhu, Ying Nian Wu

The grid cells in the mammalian medial entorhinal cortex exhibit striking hexagonal firing patterns when the agent navigates in the open field. It is hypothesized that the grid cells are involved in path integration, so that the agent is aware of its self-position by accumulating its self-motion. Assuming the grid cells form a vector representation of self-position, we elucidate a minimally simple recurrent model for path integration, which models the change of the vector representation given the self-motion, and we discern two matrix Lie algebras and their Lie groups that are naturally coupled together. This enables us to connect the path integration model to the dimension reduction model for place cells via group representation theory of harmonic analysis. By reconstructing the kernel functions for place cells, our model learns hexagonal grid patterns that characterize the grid cells. The learned model is capable of near-perfect path integration, and it is also capable of error correction.
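
The recurrent update described above, in which self-motion transforms a vector representation of position, can be sketched in its simplest form: a 1D position encoded by a 2D vector that is rotated by each displacement. This toy is an assumption for illustration (the paper's representation is higher-dimensional and learned); `motion_matrix`, `omega`, and the step values are hypothetical.

```python
import math

def motion_matrix(dx, omega):
    """2x2 rotation by angle omega * dx, i.e. the matrix exponential
    exp(dx * B) for the skew-symmetric generator B = [[0, -omega], [omega, 0]]."""
    c, s = math.cos(omega * dx), math.sin(omega * dx)
    return [[c, -s], [s, c]]

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def path_integrate(v0, displacements, omega):
    """Recurrent update v <- M(dx) v for each self-motion dx."""
    v = list(v0)
    for dx in displacements:
        v = apply(motion_matrix(dx, omega), v)
    return v

omega = 0.7
v0 = [1.0, 0.0]
steps = [0.2, -0.5, 1.1, 0.3]
v_path = path_integrate(v0, steps, omega)                  # accumulate step by step
v_direct = apply(motion_matrix(sum(steps), omega), v0)     # jump by total displacement
```

Because the matrices form a one-parameter group, M(a) M(b) = M(a + b), accumulating small motions lands on the same vector as a single move by the total displacement.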

Learning Latent Space Energy-Based Prior Model

Jun 15, 2020
Bo Pang, Tian Han, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu

The generator model assumes that the observed example is generated by a low-dimensional latent vector via a top-down network, and the latent vector follows a simple and known prior distribution, such as uniform or Gaussian white noise distribution. While we can learn an expressive top-down network to map the prior distribution to the data distribution, we can also learn an expressive prior model instead of assuming a given prior distribution. This follows the philosophy of empirical Bayes where the prior model is learned from the observed data. We propose to learn an energy-based prior model for the latent vector, where the energy function is parametrized by a very simple multi-layer perceptron. Due to the low-dimensionality of the latent space, learning a latent space energy-based prior model proves to be both feasible and desirable. In this paper, we develop the maximum likelihood learning algorithm and its variation based on short-run Markov chain Monte Carlo sampling from the prior and the posterior distributions of the latent vector, and we show that the learned model exhibits strong performance in terms of image and text generation and anomaly detection.

* 22 pages, 4 figures
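
For reference, the model described in the abstract above can be written compactly as follows. The symbols (p_0 for the reference prior, f_α for the MLP, p_β for the top-down generator, s for the step size) are assumed notation for this sketch.

```latex
% Assumed notation: p_0 is the reference prior (e.g. Gaussian white noise),
% f_\alpha the MLP, p_\beta the top-down generator.
p_\alpha(z) = \frac{1}{Z(\alpha)} \exp\big(f_\alpha(z)\big)\, p_0(z), \qquad
p_\theta(x, z) = p_\alpha(z)\, p_\beta(x \mid z)

% One Langevin step of size s targeting a density \pi(z),
% where \pi is either the prior p_\alpha(z) or the posterior p_\theta(z \mid x):
z_{k+1} = z_k + \frac{s^2}{2} \nabla_z \log \pi(z_k) + s\, \epsilon_k, \qquad
\epsilon_k \sim \mathcal{N}(0, I)
```

Short-run sampling means running this update for a small fixed number of steps from a fixed initial distribution, which is affordable because z is low-dimensional.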

Learning Energy-based Model with Flow-based Backbone by Neural Transport MCMC

Jun 12, 2020
Erik Nijkamp, Ruiqi Gao, Pavel Sountsov, Srinivas Vasudevan, Bo Pang, Song-Chun Zhu, Ying Nian Wu

Learning an energy-based model (EBM) requires MCMC sampling of the learned model as the inner loop of the learning algorithm. However, MCMC sampling of an EBM in data space generally does not mix, because the energy function, which is usually parametrized by a deep network, is highly multi-modal in the data space. This is a serious handicap for both the theory and practice of EBMs. In this paper, we propose to learn an EBM with a flow-based model serving as a backbone, so that the EBM is a correction or an exponential tilting of the flow-based model. We show that the model has a particularly simple form in the space of the latent variables of the flow-based model, and MCMC sampling of the EBM in the latent space, which is a simple special case of neural transport MCMC, mixes well and traverses modes in the data space. This enables proper sampling and learning of EBMs.
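
The "exponential tilting" and the "particularly simple form" in latent space can be spelled out with a change of variables. The notation below is assumed for illustration and may differ from the paper's.

```latex
% Assumed notation: q_\alpha is the flow-based model, x = g_\alpha(z) its
% invertible map, q_0 the base distribution of z, f_\theta the correction term.
p_\theta(x) = \frac{1}{Z(\theta)} \exp\big(f_\theta(x)\big)\, q_\alpha(x)

% Change of variables: q_\alpha(x)\, \lvert \det \partial g_\alpha / \partial z \rvert = q_0(z),
% so the Jacobian is absorbed and the latent-space density is
p_\theta(z) = \frac{1}{Z(\theta)} \exp\big(f_\theta(g_\alpha(z))\big)\, q_0(z)
```

In latent space the model is a tilting of the simple base distribution q_0, which is closer to unimodal than the data-space density; this is the stated reason sampling there mixes better.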

Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning

Jun 11, 2020
Qing Li, Siyuan Huang, Yining Hong, Yixin Chen, Ying Nian Wu, Song-Chun Zhu

The goal of neural-symbolic computation is to integrate the connectionist and symbolist paradigms. Prior methods learn the neural-symbolic models using reinforcement learning (RL) approaches, which ignore the error propagation in the symbolic reasoning module and thus converge slowly with sparse rewards. In this paper, we address these issues and close the loop of neural-symbolic learning by (1) introducing the grammar model as a symbolic prior to bridge neural perception and symbolic reasoning, and (2) proposing a novel back-search algorithm which mimics the top-down human-like learning procedure to propagate the error through the symbolic reasoning module efficiently. We further interpret the proposed learning framework as maximum likelihood estimation using Markov chain Monte Carlo sampling and the back-search algorithm as a Metropolis-Hastings sampler. The experiments are conducted on two weakly-supervised neural-symbolic tasks: (1) handwritten formula recognition on the newly introduced HWF dataset; (2) visual question answering on the CLEVR dataset. The results show that our approach significantly outperforms the RL methods in terms of performance, convergence speed, and data efficiency. Our code and data are released at https://liqing-ustc.github.io/NGS.

* ICML 2020. Project page: https://liqing-ustc.github.io/NGS
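
The abstract interprets back-search as a Metropolis-Hastings sampler. For readers unfamiliar with the accept/reject rule this refers to, here is a generic sketch on a three-state toy target. It is not the back-search algorithm; the target `weights`, the uniform proposal, and the function name are all assumptions.

```python
import random

def metropolis_hastings(weights, n_samples, rng):
    """Sample state indices with probability proportional to `weights`,
    using a symmetric uniform proposal over all states."""
    state = 0
    samples = []
    for _ in range(n_samples):
        proposal = rng.randrange(len(weights))
        # Symmetric proposal, so the acceptance probability is
        # min(1, w(proposal) / w(state)).
        if rng.random() < min(1.0, weights[proposal] / weights[state]):
            state = proposal
        samples.append(state)
    return samples

weights = [1.0, 2.0, 3.0]  # unnormalized target distribution
samples = metropolis_hastings(weights, 20000, random.Random(0))
freqs = [samples.count(k) / len(samples) for k in range(len(weights))]
```

The empirical frequencies in `freqs` should approach 1/6, 2/6, and 3/6 as the chain runs longer; only ratios of the unnormalized weights are ever needed.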

Joint Training of Variational Auto-Encoder and Latent Energy-Based Model

Jun 10, 2020
Tian Han, Erik Nijkamp, Linqi Zhou, Bo Pang, Song-Chun Zhu, Ying Nian Wu

This paper proposes a joint training method to learn both the variational auto-encoder (VAE) and the latent energy-based model (EBM). The joint training of VAE and latent EBM is based on an objective function that consists of three Kullback-Leibler divergences between three joint distributions on the latent vector and the image, and the objective function is of an elegant symmetric and anti-symmetric form of divergence triangle that seamlessly integrates variational and adversarial learning. In this joint training scheme, the latent EBM serves as a critic of the generator model, while the generator model and the inference model in VAE serve as the approximate synthesis sampler and inference sampler of the latent EBM. Our experiments show that the joint training greatly improves the synthesis quality of the VAE. It also enables learning of an energy function that is capable of detecting out-of-sample examples for anomaly detection.
