Jun Zhu

Tsinghua University

KO-PDE: Kernel Optimized Discovery of Partial Differential Equations with Varying Coefficients

Jun 02, 2021

Yingtao Luo, Qiang Liu, Yuntian Chen, Wenbo Hu, Jun Zhu

Abstract: Partial differential equations (PDEs) fitting scientific data can represent physical laws with explainable mechanisms for various mathematically-oriented subjects. Most natural dynamics are expressed by PDEs with varying coefficients (PDEs-VC), which highlights the importance of PDE discovery. Previous algorithms can discover some simple instances of PDEs-VC but fail in the discovery of PDEs with coefficients of higher complexity, as a result of coefficient estimation inaccuracy. In this paper, we propose KO-PDE, a kernel optimized regression method that incorporates the kernel density estimation of adjacent coefficients to reduce the coefficient estimation error. KO-PDE can discover PDEs-VC on which previous baselines fail and is more robust against inevitable noise in data. In experiments, the PDEs-VC of seven challenging spatiotemporal scientific datasets in fluid dynamics are all discovered by KO-PDE, while the three baselines render false results in most cases. With state-of-the-art performance, KO-PDE sheds light on the automatic description of natural phenomena using discovered PDEs in the real world.

* Preprint. Under review

Adversarial Training with Rectified Rejection

May 31, 2021

Tianyu Pang, Huishuai Zhang, Di He, Yinpeng Dong, Hang Su, Wei Chen, Jun Zhu, Tie-Yan Liu

Abstract: Adversarial training (AT) is one of the most effective strategies for promoting model robustness, yet even the state-of-the-art adversarially trained models struggle to exceed 60% robust test accuracy on CIFAR-10 without additional data, which is far from practical. A natural way to break this accuracy bottleneck is to introduce a rejection option, where confidence is a commonly used certainty proxy. However, the vanilla confidence can overestimate the model certainty if the input is wrongly classified. To this end, we propose to use true confidence (T-Con) (i.e., predicted probability of the true class) as a certainty oracle, and learn to predict T-Con by rectifying confidence. We prove that under mild conditions, a rectified confidence (R-Con) rejector and a confidence rejector can be coupled to distinguish any wrongly classified input from correctly classified ones, even under adaptive attacks. We also quantify that training R-Con to be aligned with T-Con could be an easier task than learning robust classifiers. In our experiments, we evaluate our rectified rejection (RR) module on CIFAR-10, CIFAR-10-C, and CIFAR-100 under several attacks, and demonstrate that the RR module is compatible with different AT frameworks in improving robustness, with little extra computation.
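
The difference between vanilla confidence and T-Con described in this abstract can be illustrated with a small sketch. It is not the paper's RR module: the learned R-Con predictor is omitted, and the helper names, the toy probabilities, and the 0.5 threshold are assumptions chosen for illustration. The sketch relies only on the fact that a true-class probability above 1/2 forces the true class to be the argmax.

```python
import numpy as np

def confidence(probs):
    """Vanilla confidence: the largest predicted class probability."""
    return probs.max(axis=1)

def true_confidence(probs, labels):
    """T-Con: the predicted probability of the true class (needs labels)."""
    return probs[np.arange(len(labels)), labels]

def accept_mask(scores, threshold=0.5):
    """Hypothetical rejection rule: accept inputs scoring above the threshold."""
    return scores > threshold

# Toy softmax outputs for three inputs whose true class is 0 (made-up values).
probs = np.array([
    [0.90, 0.05, 0.05],  # correctly classified, confident
    [0.10, 0.80, 0.10],  # wrongly classified, yet confident
    [0.40, 0.35, 0.25],  # correctly classified, unsure
])
labels = np.array([0, 0, 0])

con = confidence(probs)
tcon = true_confidence(probs, labels)
correct = probs.argmax(axis=1) == labels

# Confidence accepts the confident mistake; T-Con does not.
print("accepted by confidence:", accept_mask(con))
print("accepted by T-Con:     ", accept_mask(tcon))
print("correctly classified:  ", correct)
```

T-Con is an oracle because it needs the true label, which is why the abstract proposes learning a rectified confidence that predicts it.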

Unsupervised Part Segmentation through Disentangling Appearance and Shape

May 26, 2021

Shilong Liu, Lei Zhang, Xiao Yang, Hang Su, Jun Zhu

Abstract: We study the problem of unsupervised discovery and segmentation of object parts, which, as an intermediate local representation, are capable of finding intrinsic object structure and providing more explainable recognition results. Recent unsupervised methods have greatly relaxed the dependency on annotated data which are costly to obtain, but still rely on additional information such as object segmentation masks or saliency maps. To remove such a dependency and further improve the part segmentation performance, we develop a novel approach by disentangling the appearance and shape representations of object parts followed by reconstruction losses without using additional object mask information. To avoid degenerate solutions, a bottleneck block is designed to squeeze and expand the appearance representation, leading to a more effective disentanglement between geometry and appearance. Combined with a self-supervised part classification loss and an improved geometry concentration constraint, we can segment more consistent parts with semantic meanings. Comprehensive experiments on a wide variety of objects such as face, bird, and PASCAL VOC objects demonstrate the effectiveness of the proposed method.

* Accepted in CVPR 2021

Rethinking and Reweighting the Univariate Losses for Multi-Label Ranking: Consistency and Generalization

May 10, 2021

Guoqiang Wu, Chongxuan Li, Kun Xu, Jun Zhu

Abstract: (Partial) ranking loss is a commonly used evaluation measure for multi-label classification, which is usually optimized with convex surrogates for computational efficiency. Prior theoretical work on multi-label ranking mainly focuses on (Fisher) consistency analyses. However, there is a gap between existing theory and practice -- some pairwise losses can lead to promising performance but lack consistency, while some univariate losses are consistent but usually have no clear superiority in practice. In this paper, we attempt to fill this gap through a systematic study from two complementary perspectives of consistency and generalization error bounds of learning algorithms. Our results show that learning algorithms with the consistent univariate loss have an error bound of $O(c)$ ($c$ is the number of labels), while algorithms with the inconsistent pairwise loss depend on $O(\sqrt{c})$ as shown in prior work. This explains why the latter can achieve better performance than the former in practice. Moreover, we present an inconsistent reweighted univariate loss-based learning algorithm that enjoys an error bound of $O(\sqrt{c})$ for promising performance as well as the computational efficiency of univariate losses. Finally, experimental results validate our theoretical analyses.
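
As a rough illustration of the quantities this abstract contrasts, the sketch below computes a ranking loss for one example together with a generic pairwise and a generic univariate logistic surrogate. The function names, the tie weight of 1/2, and the choice of logistic surrogates are assumptions made for illustration; the paper's exact losses, including its reweighted univariate loss, are not reproduced here.

```python
import numpy as np

def ranking_loss(scores, y):
    """Fraction of (relevant, irrelevant) label pairs ranked in the wrong order.

    Ties count as half an error (an assumed convention).
    """
    pos, neg = scores[y == 1], scores[y == 0]
    if len(pos) == 0 or len(neg) == 0:
        return 0.0
    diff = pos[:, None] - neg[None, :]  # one entry per label pair
    wrong = (diff < 0).sum() + 0.5 * (diff == 0).sum()
    return float(wrong / diff.size)

def pairwise_logistic(scores, y):
    """Pairwise surrogate: one logistic term per (relevant, irrelevant) pair.

    Assumes at least one relevant and one irrelevant label.
    """
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.log1p(np.exp(-diff)).mean())

def univariate_logistic(scores, y):
    """Univariate surrogate: one logistic term per label, c terms in total."""
    signs = 2 * y - 1  # map {0, 1} labels to {-1, +1}
    return float(np.log1p(np.exp(-signs * scores)).mean())

# Toy example with c = 4 labels; the first two are relevant (made-up scores).
scores = np.array([2.0, 0.5, 1.0, -1.0])
y = np.array([1, 1, 0, 0])

print(ranking_loss(scores, y))  # one of the four pairs is misordered
print(pairwise_logistic(scores, y))
print(univariate_logistic(scores, y))
```

The pairwise surrogate touches every (relevant, irrelevant) pair, whereas the univariate one needs only one term per label, which is the computational advantage the abstract attributes to univariate losses.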

Automated Decision-based Adversarial Attacks

May 09, 2021

Qi-An Fu, Yinpeng Dong, Hang Su, Jun Zhu

Abstract: Deep learning models are vulnerable to adversarial examples, which can fool a target classifier by imposing imperceptible perturbations onto natural examples. In this work, we consider the practical and challenging decision-based black-box adversarial setting, where the attacker can only acquire the final classification labels by querying the target model without access to the model's details. Under this setting, existing works often rely on heuristics and exhibit unsatisfactory performance. To better understand the rationality of these heuristics and the limitations of existing methods, we propose to automatically discover decision-based adversarial attack algorithms. In our approach, we construct a search space using basic mathematical operations as building blocks and develop a random search algorithm to efficiently explore this space by incorporating several pruning techniques and intuitive priors inspired by program synthesis works. Although we use a small and fast model to efficiently evaluate attack algorithms during the search, extensive experiments demonstrate that the discovered algorithms are simple yet query-efficient when transferred to larger normal and defensive models on the CIFAR-10 and ImageNet datasets. They consistently achieve performance comparable to or better than state-of-the-art decision-based attack methods.

* 16 pages, 6 figures

MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering

May 05, 2021

Tsung Wei Tsai, Chongxuan Li, Jun Zhu

Abstract: We present Mixture of Contrastive Experts (MiCE), a unified probabilistic clustering framework that simultaneously exploits the discriminative representations learned by contrastive learning and the semantic structures captured by a latent mixture model. Motivated by the mixture of experts, MiCE employs a gating function to partition an unlabeled dataset into subsets according to the latent semantics and multiple experts to discriminate distinct subsets of instances assigned to them in a contrastive learning manner. To solve the nontrivial inference and learning problems caused by the latent variables, we further develop a scalable variant of the Expectation-Maximization (EM) algorithm for MiCE and provide a proof of convergence. Empirically, we evaluate the clustering performance of MiCE on four widely adopted natural image datasets. MiCE achieves significantly better results than various previous methods and a strong contrastive learning baseline.

* International Conference on Learning Representations (ICLR) 2021

Few-shot Continual Learning: a Brain-inspired Approach

Apr 19, 2021

Liyuan Wang, Qian Li, Yi Zhong, Jun Zhu

Abstract: It is an important yet challenging setting to continually learn new tasks from a few examples. Although numerous efforts have been devoted to either continual learning or few-shot learning, little work has considered this new setting of few-shot continual learning (FSCL), which needs to minimize the catastrophic forgetting of old tasks and gradually improve the ability of few-shot generalization. In this paper, we provide a first systematic study on FSCL and present an effective solution with deep neural networks. Our solution is based on the observation that continual learning of a task sequence inevitably interferes with few-shot generalization, which makes it highly nontrivial to extend few-shot learning strategies to continual learning scenarios. We draw inspiration from the robust brain system and develop a method that (1) interdependently updates a pair of fast / slow weights for continual learning and few-shot learning to disentangle their divergent objectives, inspired by the biological model of meta-plasticity and fast / slow synapse; and (2) applies a brain-inspired two-step consolidation strategy to learn a task sequence without forgetting in the fast weights while improving generalization without overfitting in the slow weights. Extensive results on various benchmarks show that our method achieves better performance than joint training of all the tasks ever seen. The ability of few-shot generalization is also substantially improved from incoming tasks and examples.

Counter-Strike Deathmatch with Large-Scale Behavioural Cloning

Apr 09, 2021

Tim Pearce, Jun Zhu

Abstract: This paper describes an AI agent that plays the popular first-person-shooter (FPS) video game 'Counter-Strike: Global Offensive' (CSGO) from pixel input. The agent, a deep neural network, matches the performance of the medium difficulty built-in AI on the deathmatch game mode, whilst adopting a humanlike play style. Unlike much prior work in games, no API is available for CSGO, so algorithms must train and run in real-time. This limits the quantity of on-policy data that can be generated, precluding many reinforcement learning algorithms. Our solution uses behavioural cloning - training on a large noisy dataset scraped from human play on online servers (4 million frames, comparable in size to ImageNet), and a smaller dataset of high-quality expert demonstrations. This scale is an order of magnitude larger than prior work on imitation learning in FPS games.

Accurate and Reliable Forecasting using Stochastic Differential Equations

Mar 28, 2021

Peng Cui, Zhijie Deng, Wenbo Hu, Jun Zhu

Abstract: It is critical yet challenging for deep learning models to properly characterize uncertainty that is pervasive in real-world environments. Although considerable effort has been made, such as heteroscedastic neural networks (HNNs), little work has demonstrated satisfactory practicability due to the different levels of compromise on learning efficiency, quality of uncertainty estimates, and predictive performance. Moreover, existing HNNs typically fail to construct an explicit interaction between the prediction and its associated uncertainty. This paper aims to remedy these issues by developing SDE-HNN, a new heteroscedastic neural network equipped with stochastic differential equations (SDE) to characterize the interaction between the predictive mean and variance of HNNs for accurate and reliable regression. Theoretically, we show the existence and uniqueness of the solution to the devised neural SDE. Moreover, based on the bias-variance trade-off for the optimization in SDE-HNN, we design an enhanced numerical SDE solver to improve the learning stability. Finally, to more systematically evaluate the predictive uncertainty, we present two new diagnostic uncertainty metrics. Experiments on challenging datasets show that our method significantly outperforms the state-of-the-art baselines in terms of both predictive performance and uncertainty quantification, delivering well-calibrated and sharp prediction intervals.

LiBRe: A Practical Bayesian Approach to Adversarial Detection

Mar 27, 2021

Zhijie Deng, Xiao Yang, Shizhen Xu, Hang Su, Jun Zhu

Abstract: Despite their appealing flexibility, deep neural networks (DNNs) are vulnerable to adversarial examples. Various adversarial defense strategies have been proposed to resolve this problem, but they typically demonstrate restricted practicability owing to insurmountable compromise on universality, effectiveness, or efficiency. In this work, we propose a more practical approach, Lightweight Bayesian Refinement (LiBRe), in the spirit of leveraging Bayesian neural networks (BNNs) for adversarial detection. Empowered by the task and attack agnostic modeling under Bayes principle, LiBRe can endow a variety of pre-trained task-dependent DNNs with the ability to defend against heterogeneous adversarial attacks at a low cost. We develop and integrate advanced learning techniques to make LiBRe appropriate for adversarial detection. Concretely, we build the few-layer deep ensemble variational and adopt the pre-training & fine-tuning workflow to boost the effectiveness and efficiency of LiBRe. We further provide a novel insight to realise adversarial detection-oriented uncertainty quantification without inefficiently crafting adversarial examples during training. Extensive empirical studies covering a wide range of scenarios verify the practicability of LiBRe. We also conduct thorough ablation studies to evidence the superiority of our modeling and learning strategies.

* IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR), 2021
