Jun Zhu

Tsinghua University

Neural Eigenfunctions Are Structured Representation Learners

Oct 23, 2022

Zhijie Deng, Jiaxin Shi, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu

Abstract: In this paper, we introduce a scalable method for learning structured, adaptive-length deep representations. Our approach is to train neural networks such that they approximate the principal eigenfunctions of a kernel. We show that, when the kernel is derived from positive relations in a contrastive learning setup, our method outperforms a number of competitive baselines in visual representation learning and transfer learning benchmarks, and importantly, produces structured representations where the order of features indicates degrees of importance. We demonstrate the use of such representations as adaptive-length codes in image retrieval systems. By truncation according to feature importance, our method requires up to 16$\times$ shorter representation length than leading self-supervised learning methods to achieve similar retrieval performance. We further apply our method to graph data and report strong results on a node representation learning benchmark with more than one million nodes.
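The idea of features ordered by importance can be illustrated with a classical, non-neural analogue: eigendecomposing a kernel Gram matrix and sorting the eigenvectors by eigenvalue. This is only a minimal sketch under stated assumptions; the RBF kernel, the function name `ordered_spectral_features`, and the toy data are hypothetical and are not the paper's neural eigenfunction training procedure.

```python
import numpy as np

def ordered_spectral_features(X, k, gamma=1.0):
    """Return the top-k eigenvalues and eigenvectors of an RBF Gram matrix.

    Columns are sorted by decreasing eigenvalue, so column 0 is the most
    important feature and any prefix of columns is a valid shorter code.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq_dists)            # RBF kernel (an assumed choice)
    eigvals, eigvecs = np.linalg.eigh(K)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                 # toy data, purely illustrative
eigvals, feats = ordered_spectral_features(X, k=8)
short_code = feats[:, :2]                    # truncated, shorter representation
```

Because the columns are ordered, truncating to the first few columns keeps the directions with the largest eigenvalues, which is the property that makes adaptive-length codes possible.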

A Unified Hard-Constraint Framework for Solving Geometrically Complex PDEs

Oct 10, 2022

Songming Liu, Zhongkai Hao, Chengyang Ying, Hang Su, Jun Zhu, Ze Cheng

Abstract: We present a unified hard-constraint framework for solving geometrically complex PDEs with neural networks, where the most commonly used Dirichlet, Neumann, and Robin boundary conditions (BCs) are considered. Specifically, we first introduce the "extra fields" from the mixed finite element method to reformulate the PDEs so as to equivalently transform the three types of BCs into linear forms. Based on the reformulation, we derive the general solutions of the BCs analytically, which are employed to construct an ansatz that automatically satisfies the BCs. With such a framework, we can train the neural networks without adding extra loss terms and thus efficiently handle geometrically complex PDEs, alleviating the unbalanced competition between the loss terms corresponding to the BCs and PDEs. We theoretically demonstrate that the "extra fields" can stabilize the training process. Experimental results on real-world geometrically complex PDEs showcase the effectiveness of our method compared with state-of-the-art baselines.

* 10 pages, 5 figures, NeurIPS 2022
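To make the notion of an ansatz that automatically satisfies the BCs concrete, the sketch below uses the simplest textbook case: a 1D Dirichlet problem on an interval. It is an illustration under stated assumptions, not the paper's extra-field construction for Neumann and Robin conditions; `hard_constrained_solution` and `toy_network` are hypothetical names, and the "network" is just a random, untrained function.

```python
import numpy as np

def hard_constrained_solution(x, network, a=0.0, b=1.0, u_a=0.0, u_b=1.0):
    """Ansatz u(x) = g(x) + d(x) * network(x) that meets Dirichlet BCs exactly.

    g interpolates the boundary values and d vanishes at x = a and x = b,
    so u(a) = u_a and u(b) = u_b regardless of what `network` outputs.
    """
    g = u_a + (u_b - u_a) * (x - a) / (b - a)   # linear boundary interpolant
    d = (x - a) * (b - x)                        # zero on the boundary
    return g + d * network(x)

def toy_network(x, seed=0):
    """Stand-in for a neural network: one random tanh layer (untrained)."""
    rng = np.random.default_rng(seed)
    w1, w2 = rng.normal(size=(2, 16))
    return np.tanh(np.outer(x, w1)) @ w2

x = np.array([0.0, 0.5, 1.0])
u = hard_constrained_solution(x, toy_network, u_a=2.0, u_b=-1.0)
```

Since the boundary values hold by construction, a training loss built on such an ansatz would only need the PDE residual term, which is the motivation the abstract gives for avoiding extra BC loss terms.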

ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints

Oct 08, 2022

Yinpeng Dong, Shouwei Ruan, Hang Su, Caixin Kang, Xingxing Wei, Jun Zhu

Abstract: Recent studies have demonstrated that visual recognition models lack robustness to distribution shift. However, current work mainly considers model robustness to 2D image transformations, leaving viewpoint changes in the 3D world less explored. In general, viewpoint changes are prevalent in various real-world applications (e.g., autonomous driving), making it imperative to evaluate viewpoint robustness. In this paper, we propose a novel method called ViewFool to find adversarial viewpoints that mislead visual recognition models. By encoding real-world objects as neural radiance fields (NeRF), ViewFool characterizes a distribution of diverse adversarial viewpoints under an entropic regularizer, which helps to handle the fluctuations of the real camera pose and mitigate the reality gap between the real objects and their neural representations. Experiments validate that common image classifiers are extremely vulnerable to the generated adversarial viewpoints, which also exhibit high cross-model transferability. Based on ViewFool, we introduce ImageNet-V, a new out-of-distribution dataset for benchmarking viewpoint robustness of image classifiers. Evaluation results on 40 classifiers with diverse architectures, objective functions, and data augmentations reveal a significant drop in model performance when tested on ImageNet-V, which opens the possibility of leveraging ViewFool as an effective data augmentation strategy to improve viewpoint robustness.

* NeurIPS 2022

Equivariant Energy-Guided SDE for Inverse Molecular Design

Sep 30, 2022

Fan Bao, Min Zhao, Zhongkai Hao, Peiyao Li, Chongxuan Li, Jun Zhu

Abstract: Inverse molecular design is critical in materials science and drug discovery, where the generated molecules should satisfy certain desirable properties. In this paper, we propose equivariant energy-guided stochastic differential equations (EEGSDE), a flexible framework for controllable 3D molecule generation under the guidance of an energy function in diffusion models. Formally, we show that EEGSDE naturally exploits the geometric symmetry in 3D molecular conformation, as long as the energy function is invariant to orthogonal transformations. Empirically, under the guidance of designed energy functions, EEGSDE significantly improves the baseline on QM9, in inverse molecular design targeted to quantum properties and molecular structures. Furthermore, EEGSDE is able to generate molecules with multiple target properties by combining the corresponding energy functions linearly.
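The invariance condition on the energy function can be checked numerically on a toy example. The sketch below assumes a hypothetical energy, `pairwise_distance_energy`, that depends only on interatomic distances; it is not one of the paper's energy functions. Any such energy is unchanged when all coordinates are multiplied by an orthogonal matrix.

```python
import numpy as np

def pairwise_distance_energy(coords, target=1.5):
    """Toy energy: squared deviation of all pairwise distances from `target`.

    It depends on the coordinates only through distances, so rotating or
    reflecting the whole point cloud leaves the value unchanged.
    """
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    iu = np.triu_indices(len(coords), k=1)       # each unordered pair once
    return float(np.sum((dists[iu] - target) ** 2))

rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))                  # 5 "atoms" in 3D (toy data)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))      # random orthogonal matrix
rotated = coords @ Q.T

e_original = pairwise_distance_energy(coords)
e_rotated = pairwise_distance_energy(rotated)     # agrees up to round-off
```

Because a sum of invariant energies is again invariant, combining several such functions linearly, as the abstract describes for multiple target properties, preserves the condition.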

INT: Towards Infinite-frames 3D Detection with An Efficient Framework

Sep 30, 2022

Jianyun Xu, Zhenwei Miao, Da Zhang, Hongyu Pan, Kaixuan Liu, Peihan Hao, Jun Zhu, Zhengyang Sun, Hongmin Li, Xin Zhan

Abstract: It is natural to construct a multi-frame instead of a single-frame 3D detector for a continuous-time stream. Although increasing the number of frames might improve performance, previous multi-frame studies only used very limited frames to build their systems due to the dramatically increased computational and memory cost. To address these issues, we propose a novel on-stream training and prediction framework that, in theory, can employ an infinite number of frames while keeping the same amount of computation as a single-frame detector. This infinite framework (INT), which can be used with most existing detectors, is applied, for example, to the popular CenterPoint, with significant latency reductions and performance improvements. We also conduct extensive experiments on two large-scale datasets, nuScenes and Waymo Open Dataset, to demonstrate the scheme's effectiveness and efficiency. By employing INT on CenterPoint, we obtain performance boosts of around 7% (Waymo) and 15% (nuScenes) with only 2-4 ms of latency overhead, and it is currently SOTA on the Waymo 3D Detection leaderboard.

* Accepted by ECCV 2022

Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling

Sep 29, 2022

Huayu Chen, Cheng Lu, Chengyang Ying, Hang Su, Jun Zhu

Abstract: In offline reinforcement learning, weighted regression is a common method to ensure the learned policy stays close to the behavior policy and to prevent selecting out-of-sample actions. In this work, we show that due to the limited distributional expressivity of policy models, previous methods might still select unseen actions during training, which deviates from their initial motivation. To address this problem, we adopt a generative approach by decoupling the learned policy into two parts: an expressive generative behavior model and an action evaluation model. The key insight is that such decoupling avoids learning an explicitly parameterized policy model with a closed-form expression. Directly learning the behavior policy allows us to leverage existing advances in generative modeling, such as diffusion-based methods, to model diverse behaviors. As for action evaluation, we combine our method with an in-sample planning technique to further avoid selecting out-of-sample actions and increase computational efficiency. Experimental results on D4RL datasets show that our proposed method achieves competitive or superior performance compared with state-of-the-art offline RL methods, especially in complex tasks such as AntMaze. We also empirically demonstrate that our method can successfully learn from a heterogeneous dataset containing multiple distinctive but similarly successful strategies, whereas previous unimodal policies fail.

All are Worth Words: a ViT Backbone for Score-based Diffusion Models

Sep 25, 2022

Fan Bao, Chongxuan Li, Yue Cao, Jun Zhu

Abstract: Vision transformers (ViT) have shown promise in various vision tasks, including low-level ones, while the U-Net remains dominant in score-based diffusion models. In this paper, we perform a systematic empirical study of ViT-based architectures in diffusion models. Our results suggest that adding extra long skip connections (like the U-Net) to ViT is crucial to diffusion models. The new ViT architecture, together with other improvements, is referred to as U-ViT. On several popular visual datasets, U-ViT achieves generation results competitive with SOTA U-Net while requiring a comparable amount of parameters and computation, if not less.

Bi-level Physics-Informed Neural Networks for PDE Constrained Optimization using Broyden's Hypergradients

Sep 15, 2022

Zhongkai Hao, Chengyang Ying, Hang Su, Jun Zhu, Jian Song, Ze Cheng

Abstract: Deep learning-based approaches such as physics-informed neural networks (PINNs) and DeepONets have shown promise in solving PDE-constrained optimization (PDECO) problems. However, existing methods are insufficient to handle PDE constraints that have a complicated or nonlinear dependency on optimization targets. In this paper, we present a novel bi-level optimization framework to resolve the challenge by decoupling the optimization of the targets and constraints. For the inner loop optimization, we adopt PINNs to solve the PDE constraints only. For the outer loop, we design a novel method by using Broyden's method based on the Implicit Function Theorem (IFT), which is efficient and accurate for approximating hypergradients. We further present theoretical explanations and error analysis of the hypergradient computation. Extensive experiments on multiple large-scale and nonlinear PDE-constrained optimization problems demonstrate that our method achieves state-of-the-art results compared with strong baselines.

On the Reuse Bias in Off-Policy Reinforcement Learning

Sep 15, 2022

Chengyang Ying, Zhongkai Hao, Xinning Zhou, Hang Su, Dong Yan, Jun Zhu

Abstract: Importance sampling (IS) is a popular technique in off-policy evaluation, which re-weights the return of trajectories in the replay buffer to boost sample efficiency. However, training with IS can be unstable, and previous attempts to address this issue mainly focus on analyzing the variance of IS. In this paper, we reveal that the instability is also related to a new notion of Reuse Bias of IS -- the bias in off-policy evaluation caused by the reuse of the replay buffer for evaluation and optimization. We theoretically show that the off-policy evaluation and optimization of the current policy with the data from the replay buffer result in an overestimation of the objective, which may cause an erroneous gradient update and degrade performance. We further provide a high-probability upper bound of the Reuse Bias, and show that controlling one term of the upper bound can control the Reuse Bias by introducing the concept of stability for off-policy algorithms. Based on these analyses, we finally present a novel Bias-Regularized Importance Sampling (BIRIS) framework along with practical algorithms, which can alleviate the negative impact of the Reuse Bias. Experimental results show that our BIRIS-based methods can significantly improve the sample efficiency on a series of continuous control tasks in MuJoCo.
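For readers unfamiliar with the re-weighting step, the following is a minimal sketch of ordinary importance sampling in a one-step (bandit) setting. It assumes discrete actions and known policy probabilities; the function name and the toy numbers are hypothetical, and the sketch implements neither BIRIS nor a demonstration of the Reuse Bias itself.

```python
import numpy as np

def importance_sampling_estimate(actions, rewards, pi_target, pi_behavior):
    """Estimate the target policy's expected reward from behavior-policy data.

    Each logged reward is re-weighted by pi_target(a) / pi_behavior(a),
    the ratio of action probabilities under the two policies.
    """
    actions = np.asarray(actions)
    rewards = np.asarray(rewards, dtype=float)
    pi_target = np.asarray(pi_target, dtype=float)
    pi_behavior = np.asarray(pi_behavior, dtype=float)
    weights = pi_target[actions] / pi_behavior[actions]
    return float(np.mean(weights * rewards))

# Toy replay buffer collected under the behavior policy (illustrative values).
rng = np.random.default_rng(0)
pi_behavior = np.array([0.25, 0.75])
pi_target = np.array([0.5, 0.5])
mean_reward = np.array([1.0, 3.0])               # assumed per-action means
logged_actions = rng.choice(2, size=10_000, p=pi_behavior)
logged_rewards = mean_reward[logged_actions] + rng.normal(size=10_000)
estimate = importance_sampling_estimate(
    logged_actions, logged_rewards, pi_target, pi_behavior
)
```

The estimator is unbiased when the target policy is fixed independently of the logged data; the abstract's point is that this independence is lost once the same buffer is also used to optimize the policy being evaluated.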

Regret Analysis for Hierarchical Experts Bandit Problem

Aug 11, 2022

Qihan Guo, Siwei Wang, Jun Zhu

Abstract: We study an extension of the standard bandit problem in which there are R layers of experts. Multi-layered experts make selections layer by layer, and only the experts in the last layer can play arms. The goal of the learning policy is to minimize the total regret in this hierarchical experts setting. We first analyze the case in which total regret grows linearly with the number of layers. Then we focus on the case in which all experts play the Upper Confidence Bound (UCB) strategy and give several sub-linear upper bounds for different circumstances. Finally, we design experiments to support the regret analysis for the general case of the hierarchical UCB structure and show the practical significance of our theoretical results. This article offers insights into reasonable hierarchical decision structures.

* 14 pages, 2 figures, submitted to AAAI 2023
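As background for the UCB strategy mentioned in the abstract, here is a minimal sketch of the standard single-layer UCB1 rule. It assumes rewards in [0, 1] and the usual exploration bonus sqrt(2 ln t / n); the function name `ucb1` and the Bernoulli arm means are hypothetical, and the hierarchical, multi-layer setting studied in the paper is not implemented.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds and return how often each arm was played.

    `pull(arm)` must return a reward in [0, 1]. After one initial pull per
    arm, the arm with the largest empirical mean plus bonus is chosen.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1                          # play each arm once first
        else:
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts

# Bernoulli arms with assumed success probabilities (illustrative only).
means = [0.2, 0.5, 0.8]
rng = random.Random(0)
counts = ucb1(lambda a: float(rng.random() < means[a]), n_arms=3, horizon=2000)
```

In a layered variant, each expert would run a rule of this kind over the experts (or arms) beneath it, which is the structure whose regret the paper analyzes.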
