Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jun Zhu

FutureFab.AI

Score Regularized Policy Optimization through Diffusion Behavior

Oct 12, 2023

Huayu Chen, Cheng Lu, Zhengyi Wang, Hang Su, Jun Zhu

Abstract:Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling, which excels at representing heterogeneous behavior policies. However, sampling from diffusion policies is considerably slow because it necessitates tens to hundreds of iterative inference steps for one action. To address this issue, we propose to extract an efficient deterministic inference policy from critic models and pretrained diffusion behavior models, leveraging the latter to directly regularize the policy gradient with the behavior distribution's score function during optimization. Our method enjoys powerful generative capabilities of diffusion modeling while completely circumventing the computationally intensive and time-consuming diffusion sampling scheme, both during training and evaluation. Extensive results on D4RL tasks show that our method boosts action sampling speed by more than 25 times compared with various leading diffusion-based methods in locomotion tasks, while still maintaining state-of-the-art performance.

* 18 pages

Via

Access Paper or Ask Questions

Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality

Oct 11, 2023

Liyuan Wang, Jingyi Xie, Xingxing Zhang, Mingyi Huang, Hang Su, Jun Zhu

Figure 1 for Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality

Figure 2 for Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality

Figure 3 for Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality

Figure 4 for Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality

Abstract:Prompt-based continual learning is an emerging direction in leveraging pre-trained knowledge for downstream continual learning, and has almost reached the performance pinnacle under supervised pre-training. However, our empirical research reveals that the current strategies fall short of their full potential under the more realistic self-supervised pre-training, which is essential for handling vast quantities of unlabeled data in practice. This is largely due to the difficulty of task-specific knowledge being incorporated into instructed representations via prompt parameters and predicted by uninstructed representations at test time. To overcome the exposed sub-optimality, we conduct a theoretical analysis of the continual learning objective in the context of pre-training, and decompose it into hierarchical components: within-task prediction, task-identity inference, and task-adaptive prediction. Following these empirical and theoretical insights, we propose Hierarchical Decomposition (HiDe-)Prompt, an innovative approach that explicitly optimizes the hierarchical components with an ensemble of task-specific prompts and statistics of both uninstructed and instructed representations, further with the coordination of a contrastive regularization strategy. Our extensive experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning (e.g., up to 15.01% and 9.61% lead on Split CIFAR-100 and Split ImageNet-R, respectively). Our code is available at \url{https://github.com/thu-ml/HiDe-Prompt}.

* 23 pages, 20 figures, 11 tables, accepted by NeurIPS as a Spotlight

Via

Access Paper or Ask Questions

How Robust is Google's Bard to Adversarial Image Attacks?

Sep 21, 2023

Yinpeng Dong, Huanran Chen, Jiawei Chen, Zhengwei Fang, Xiao Yang, Yichi Zhang, Yu Tian, Hang Su, Jun Zhu

Figure 1 for How Robust is Google's Bard to Adversarial Image Attacks?

Figure 2 for How Robust is Google's Bard to Adversarial Image Attacks?

Figure 3 for How Robust is Google's Bard to Adversarial Image Attacks?

Figure 4 for How Robust is Google's Bard to Adversarial Image Attacks?

Abstract:Multimodal Large Language Models (MLLMs) that integrate text and other modalities (especially vision) have achieved unprecedented performance in various multimodal tasks. However, due to the unsolved adversarial robustness problem of vision models, MLLMs can have more severe safety and security risks by introducing the vision inputs. In this work, we study the adversarial robustness of Google's Bard, a competitive chatbot to ChatGPT that released its multimodal capability recently, to better understand the vulnerabilities of commercial MLLMs. By attacking white-box surrogate vision encoders or MLLMs, the generated adversarial examples can mislead Bard to output wrong image descriptions with a 22% success rate based solely on the transferability. We show that the adversarial examples can also attack other MLLMs, e.g., a 26% attack success rate against Bing Chat and a 86% attack success rate against ERNIE bot. Moreover, we identify two defense mechanisms of Bard, including face detection and toxicity detection of images. We design corresponding attacks to evade these defenses, demonstrating that the current defenses of Bard are also vulnerable. We hope this work can deepen our understanding on the robustness of MLLMs and facilitate future research on defenses. Our code is available at https://github.com/thu-ml/Attack-Bard.

* Technical report

Via

Access Paper or Ask Questions

i-Octree: A Fast, Lightweight, and Dynamic Octree for Proximity Search

Sep 15, 2023

Jun Zhu, Hongyi Li, Shengjie Wang, Zhepeng Wang, Tao Zhang

Figure 1 for i-Octree: A Fast, Lightweight, and Dynamic Octree for Proximity Search

Figure 2 for i-Octree: A Fast, Lightweight, and Dynamic Octree for Proximity Search

Figure 3 for i-Octree: A Fast, Lightweight, and Dynamic Octree for Proximity Search

Figure 4 for i-Octree: A Fast, Lightweight, and Dynamic Octree for Proximity Search

Abstract:Establishing the correspondences between newly acquired points and historically accumulated data (i.e., map) through nearest neighbors search is crucial in numerous robotic applications.However, static tree data structures are inadequate to handle large and dynamically growing maps in real-time.To address this issue, we present the i-Octree, a dynamic octree data structure that supports both fast nearest neighbor search and real-time dynamic updates, such as point insertion, deletion, and on-tree down-sampling. The i-Octree is built upon a leaf-based octree and has two key features: a local spatially continuous storing strategy that allows for fast access to points while minimizing memory usage, and local on-tree updates that significantly reduce computation time compared to existing static or dynamic tree structures.The experiments show that i-Octree surpasses state-of-the-art methods by reducing run-time by over 50% on real-world open datasets.

* 7 pages, 7 figures

Via

Access Paper or Ask Questions

Memory Efficient Optimizers with 4-bit States

Sep 06, 2023

Bingrui Li, Jianfei Chen, Jun Zhu

Figure 1 for Memory Efficient Optimizers with 4-bit States

Figure 2 for Memory Efficient Optimizers with 4-bit States

Figure 3 for Memory Efficient Optimizers with 4-bit States

Figure 4 for Memory Efficient Optimizers with 4-bit States

Abstract:Optimizer states are a major source of memory consumption for training neural networks, limiting the maximum trainable model within given memory budget. Compressing the optimizer states from 32-bit floating points to lower bitwidth is promising to reduce the training memory footprint, while the current lowest achievable bitwidth is 8-bit. In this work, we push optimizer states bitwidth down to 4-bit through a detailed empirical analysis of first and second moments. Specifically, we find that moments have complicated outlier patterns, that current block-wise quantization cannot accurately approximate. We use a smaller block size and propose to utilize both row-wise and column-wise information for better quantization. We further identify a zero point problem of quantizing the second moment, and solve this problem with a linear quantizer that excludes the zero point. Our 4-bit optimizer is evaluated on a wide variety of benchmarks including natural language understanding, machine translation, image classification, and instruction tuning. On all the tasks our optimizers can achieve comparable accuracy with their full-precision counterparts, while enjoying better memory efficiency.

* 35 pages

Via

Access Paper or Ask Questions

SeisCLIP: A seismology foundation model pre-trained by multi-modal data for multi-purpose seismic feature extraction

Sep 05, 2023

Xu Si, Xinming Wu, Hanlin Sheng, Jun Zhu, Zefeng Li

Figure 1 for SeisCLIP: A seismology foundation model pre-trained by multi-modal data for multi-purpose seismic feature extraction

Figure 2 for SeisCLIP: A seismology foundation model pre-trained by multi-modal data for multi-purpose seismic feature extraction

Figure 3 for SeisCLIP: A seismology foundation model pre-trained by multi-modal data for multi-purpose seismic feature extraction

Figure 4 for SeisCLIP: A seismology foundation model pre-trained by multi-modal data for multi-purpose seismic feature extraction

Abstract:Training specific deep learning models for particular tasks is common across various domains within seismology. However, this approach encounters two limitations: inadequate labeled data for certain tasks and limited generalization across regions. To address these challenges, we develop SeisCLIP, a seismology foundation model trained through contrastive learning from multi-modal data. It consists of a transformer encoder for extracting crucial features from time-frequency seismic spectrum and an MLP encoder for integrating the phase and source information of the same event. These encoders are jointly pre-trained on a vast dataset and the spectrum encoder is subsequently fine-tuned on smaller datasets for various downstream tasks. Notably, SeisCLIP's performance surpasses that of baseline methods in event classification, localization, and focal mechanism analysis tasks, employing distinct datasets from different regions. In conclusion, SeisCLIP holds significant potential as a foundational model in the field of seismology, paving the way for innovative directions in foundation-model-based seismology research.

* 27 pages, 9 figures, 4 tables

Via

Access Paper or Ask Questions

Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence

Aug 29, 2023

Liyuan Wang, Xingxing Zhang, Qian Li, Mingtian Zhang, Hang Su, Jun Zhu, Yi Zhong

Figure 1 for Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence

Figure 2 for Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence

Figure 3 for Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence

Figure 4 for Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence

Abstract:Continual learning aims to empower artificial intelligence (AI) with strong adaptability to the real world. For this purpose, a desirable solution should properly balance memory stability with learning plasticity, and acquire sufficient compatibility to capture the observed distributions. Existing advances mainly focus on preserving memory stability to overcome catastrophic forgetting, but remain difficult to flexibly accommodate incremental changes as biological intelligence (BI) does. By modeling a robust Drosophila learning system that actively regulates forgetting with multiple learning modules, here we propose a generic approach that appropriately attenuates old memories in parameter distributions to improve learning plasticity, and accordingly coordinates a multi-learner architecture to ensure solution compatibility. Through extensive theoretical and empirical validation, our approach not only clearly enhances the performance of continual learning, especially over synaptic regularization methods in task-incremental settings, but also potentially advances the understanding of neurological adaptive mechanisms, serving as a novel paradigm to progress AI and BI together.

Via

Access Paper or Ask Questions

Heterogeneous Multi-Task Gaussian Cox Processes

Aug 29, 2023

Feng Zhou, Quyu Kong, Zhijie Deng, Fengxiang He, Peng Cui, Jun Zhu

Abstract:This paper presents a novel extension of multi-task Gaussian Cox processes for modeling multiple heterogeneous correlated tasks jointly, e.g., classification and regression, via multi-output Gaussian processes (MOGP). A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks, while allowing for nonparametric parameter estimation. To circumvent the non-conjugate Bayesian inference in the MOGP modulated heterogeneous multi-task framework, we employ the data augmentation technique and derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters. We demonstrate the performance and inference on both 1D synthetic data as well as 2D urban data of Vancouver.

Via

Access Paper or Ask Questions

Towards Accelerated Model Training via Bayesian Data Selection

Aug 21, 2023

Zhijie Deng, Peng Cui, Jun Zhu

Abstract:Mislabeled, duplicated, or biased data in real-world scenarios can lead to prolonged training and even hinder model convergence. Traditional solutions prioritizing easy or hard samples lack the flexibility to handle such a variety simultaneously. Recent work has proposed a more reasonable data selection principle by examining the data's impact on the model's generalization loss. However, its practical adoption relies on less principled approximations and additional clean holdout data. This work solves these problems by leveraging a lightweight Bayesian treatment and incorporating off-the-shelf zero-shot predictors built on large-scale pre-trained models. The resulting algorithm is efficient and easy-to-implement. We perform extensive empirical studies on challenging benchmarks with considerable data noise and imbalance in the online batch selection scenario, and observe superior training efficiency over competitive baselines. Notably, on the challenging WebVision benchmark, our method can achieve similar predictive performance with significantly fewer training iterations than leading data selection methods.

Via

Access Paper or Ask Questions

6G Network Business Support System

Jul 19, 2023

Ye Ouyang, Yaqin Zhang, Peng Wang, Yunxin Liu, Wen Qiao, Jun Zhu, Yang Liu, Feng Zhang, Shuling Wang, Xidong Wang

Figure 1 for 6G Network Business Support System

Figure 2 for 6G Network Business Support System

Figure 3 for 6G Network Business Support System

Figure 4 for 6G Network Business Support System

Abstract:6G is the next-generation intelligent and integrated digital information infrastructure, characterized by ubiquitous interconnection, native intelligence, multi-dimensional perception, global coverage, green and low-carbon, native network security, etc. 6G will realize the transition from serving people and people-things communication to supporting the efficient connection of intelligent agents, and comprehensively leading the digital, intelligent and green transformation of the economy and the society. As the core support system for mobile communication network, 6 6G BSS need to integrate with new business models brought about by the development of the next-generation Internet and IT, upgrade from "network-centric" to "business and service centric" and "customer-centric". 6G OSS and BSS systems need to strengthen their integration to improve the operational efficiency and benefits of customers by connecting the digital intelligence support capabilities on both sides of supply and demand. This paper provides a detailed introduction to the overall vision, potential key technologies, and functional architecture of 6G BSS systems. It also presents an evolutionary roadmap and technological prospects for the BSS systems from 5G to 6G.

Via

Access Paper or Ask Questions