Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhen Liu

Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer

Jan 17, 2024

Junhao Zheng, Qianli Ma, Zhen Liu, Binquan Wu, Huawen Feng

Abstract:Multimodal Continual Instruction Tuning (MCIT) enables Multimodal Large Language Models (MLLMs) to meet continuously emerging requirements without expensive retraining. MCIT faces two major obstacles: catastrophic forgetting (where old knowledge is forgotten) and negative forward transfer (where the performance of future tasks is degraded). Although existing methods have greatly alleviated catastrophic forgetting, they still suffer from negative forward transfer. By performing singular value decomposition (SVD) on input embeddings, we discover a large discrepancy in different input embeddings. The discrepancy results in the model learning irrelevant information for old and pre-trained tasks, which leads to catastrophic forgetting and negative forward transfer. To address these issues, we propose Fwd-Prompt, a prompt-based method projecting prompt gradient to the residual space to minimize the interference between tasks and to the pre-trained subspace for reusing pre-trained knowledge. Our experiments demonstrate that Fwd-Prompt achieves state-of-the-art performance while updating fewer parameters and requiring no old samples. Our research sheds light on the potential of continuously adapting MLLMs to new tasks under the instruction tuning paradigm and encourages future studies to explore MCIT. The code will soon be publicly available.

Via

Access Paper or Ask Questions

FINER: Flexible spectral-bias tuning in Implicit NEural Representation by Variable-periodic Activation Functions

Dec 05, 2023

Zhen Liu, Hao Zhu, Qi Zhang, Jingde Fu, Weibing Deng, Zhan Ma, Yanwen Guo, Xun Cao

Abstract:Implicit Neural Representation (INR), which utilizes a neural network to map coordinate inputs to corresponding attributes, is causing a revolution in the field of signal processing. However, current INR techniques suffer from a restricted capability to tune their supported frequency set, resulting in imperfect performance when representing complex signals with multiple frequencies. We have identified that this frequency-related problem can be greatly alleviated by introducing variable-periodic activation functions, for which we propose FINER. By initializing the bias of the neural network within different ranges, sub-functions with various frequencies in the variable-periodic function are selected for activation. Consequently, the supported frequency set of FINER can be flexibly tuned, leading to improved performance in signal representation. We demonstrate the capabilities of FINER in the contexts of 2D image fitting, 3D signed distance field representation, and 5D neural radiance fields optimization, and we show that it outperforms existing INRs.

* 10 pages, 9 figures

Via

Access Paper or Ask Questions

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Nov 10, 2023

Weiyang Liu, Zeju Qiu, Yao Feng, Yuliang Xiu, Yuxuan Xue, Longhui Yu, Haiwen Feng, Zhen Liu, Juyeon Heo, Songyou Peng(+4 more)

Figure 1 for Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Figure 2 for Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Figure 3 for Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Figure 4 for Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Abstract:Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Thus, efficiently adapting these powerful models to downstream tasks is increasingly important. In this paper, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information transmission perspective, and then identify a few key desiderata that enable better parameter-efficiency. Inspired by how the Cooley-Tukey fast Fourier transform algorithm enables efficient information transmission, we propose an efficient orthogonal parameterization using butterfly structures. We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT). By subsuming OFT as a special case, BOFT introduces a generalized orthogonal finetuning framework. Finally, we conduct an extensive empirical study of adapting large vision transformers, large language models, and text-to-image diffusion models to various downstream tasks in vision and language.

* Technical Report (33 pages, 18 figures)

Via

Access Paper or Ask Questions

Ghost on the Shell: An Expressive Representation of General 3D Shapes

Oct 24, 2023

Zhen Liu, Yao Feng, Yuliang Xiu, Weiyang Liu, Liam Paull, Michael J. Black, Bernhard Schölkopf

Abstract:The creation of photorealistic virtual worlds requires the accurate modeling of 3D surface geometry for a wide range of objects. For this, meshes are appealing since they 1) enable fast physics-based rendering with realistic material and lighting, 2) support physical simulation, and 3) are memory-efficient for modern graphics pipelines. Recent work on reconstructing and statistically modeling 3D shape, however, has critiqued meshes as being topologically inflexible. To capture a wide range of object shapes, any 3D representation must be able to model solid, watertight, shapes as well as thin, open, surfaces. Recent work has focused on the former, and methods for reconstructing open surfaces do not support fast reconstruction with material and lighting or unconditional generative modelling. Inspired by the observation that open surfaces can be seen as islands floating on watertight surfaces, we parameterize open surfaces by defining a manifold signed distance field on watertight templates. With this parameterization, we further develop a grid-based and differentiable representation that parameterizes both watertight and non-watertight meshes of arbitrary topology. Our new representation, called Ghost-on-the-Shell (G-Shell), enables two important applications: differentiable rasterization-based reconstruction from multiview images and generative modelling of non-watertight meshes. We empirically demonstrate that G-Shell achieves state-of-the-art performance on non-watertight mesh reconstruction and generation tasks, while also performing effectively for watertight meshes.

* Technical Report (26 pages, 16 figures, Project Page: https://gshell3d.github.io/)

Via

Access Paper or Ask Questions

Modeling groundwater levels in California's Central Valley by hierarchical Gaussian process and neural network regression

Oct 23, 2023

Anshuman Pradhan, Kyra H. Adams, Venkat Chandrasekaran, Zhen Liu, John T. Reager, Andrew M. Stuart, Michael J. Turmon

Figure 1 for Modeling groundwater levels in California's Central Valley by hierarchical Gaussian process and neural network regression

Figure 2 for Modeling groundwater levels in California's Central Valley by hierarchical Gaussian process and neural network regression

Figure 3 for Modeling groundwater levels in California's Central Valley by hierarchical Gaussian process and neural network regression

Figure 4 for Modeling groundwater levels in California's Central Valley by hierarchical Gaussian process and neural network regression

Abstract:Modeling groundwater levels continuously across California's Central Valley (CV) hydrological system is challenging due to low-quality well data which is sparsely and noisily sampled across time and space. A novel machine learning method is proposed for modeling groundwater levels by learning from a 3D lithological texture model of the CV aquifer. The proposed formulation performs multivariate regression by combining Gaussian processes (GP) and deep neural networks (DNN). Proposed hierarchical modeling approach constitutes training the DNN to learn a lithologically informed latent space where non-parametric regression with GP is performed. The methodology is applied for modeling groundwater levels across the CV during 2015 - 2020. We demonstrate the efficacy of GP-DNN regression for modeling non-stationary features in the well data with fast and reliable uncertainty quantification. Our results indicate that the 2017 and 2019 wet years in California were largely ineffective in replenishing the groundwater loss caused during previous drought years.

* Submitted to Water Resources Research

Via

Access Paper or Ask Questions

Heterogenous Memory Augmented Neural Networks

Oct 17, 2023

Zihan Qiu, Zhen Liu, Shuicheng Yan, Shanghang Zhang, Jie Fu

Figure 1 for Heterogenous Memory Augmented Neural Networks

Figure 2 for Heterogenous Memory Augmented Neural Networks

Figure 3 for Heterogenous Memory Augmented Neural Networks

Figure 4 for Heterogenous Memory Augmented Neural Networks

Abstract:It has been shown that semi-parametric methods, which combine standard neural networks with non-parametric components such as external memory modules and data retrieval, are particularly helpful in data scarcity and out-of-distribution (OOD) scenarios. However, existing semi-parametric methods mostly depend on independent raw data points - this strategy is difficult to scale up due to both high computational costs and the incapacity of current attention mechanisms with a large number of tokens. In this paper, we introduce a novel heterogeneous memory augmentation approach for neural networks which, by introducing learnable memory tokens with attention mechanism, can effectively boost performance without huge computational overhead. Our general-purpose method can be seamlessly combined with various backbones (MLP, CNN, GNN, and Transformer) in a plug-and-play manner. We extensively evaluate our approach on various image and graph-based tasks under both in-distribution (ID) and OOD conditions and show its competitive performance against task-specific state-of-the-art methods. Code is available at \url{https://github.com/qiuzh20/HMA}.

Via

Access Paper or Ask Questions

Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation

Oct 14, 2023

Zhen Liu, Hang Gao, Hao Ma, Shuo Cai, Yunfeng Hu, Ting Qu, Hong Chen, Xun Gong

Figure 1 for Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation

Figure 2 for Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation

Figure 3 for Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation

Figure 4 for Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation

Abstract:Autonomous vehicle (AV) evaluation has been the subject of increased interest in recent years both in industry and in academia. This paper focuses on the development of a novel framework for generating adversarial driving behavior of background vehicle interfering against the AV to expose effective and rational risky events. Specifically, the adversarial behavior is learned by a reinforcement learning (RL) approach incorporated with the cumulative prospect theory (CPT) which allows representation of human risk cognition. Then, the extended version of deep deterministic policy gradient (DDPG) technique is proposed for training the adversarial policy while ensuring training stability as the CPT action-value function is leveraged. A comparative case study regarding the cut-in scenario is conducted on a high fidelity Hardware-in-the-Loop (HiL) platform and the results demonstrate the adversarial effectiveness to infer the weakness of the tested AV.

* We find there is expression error in III.A. A correction edition will be offered

Via

Access Paper or Ask Questions

Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks

Sep 29, 2023

Yanqiao Zhu, Jeehyun Hwang, Keir Adams, Zhen Liu, Bozhao Nan, Brock Stenfors, Yuanqi Du, Jatin Chauhan, Olaf Wiest, Olexandr Isayev(+3 more)

Figure 1 for Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks

Figure 2 for Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks

Figure 3 for Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks

Figure 4 for Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks

Abstract:Molecular Representation Learning (MRL) has proven impactful in numerous biochemical applications such as drug discovery and enzyme design. While Graph Neural Networks (GNNs) are effective at learning molecular representations from a 2D molecular graph or a single 3D structure, existing works often overlook the flexible nature of molecules, which continuously interconvert across conformations via chemical bond rotations and minor vibrational perturbations. To better account for molecular flexibility, some recent works formulate MRL as an ensemble learning problem, focusing on explicitly learning from a set of conformer structures. However, most of these studies have limited datasets, tasks, and models. In this work, we introduce the first MoleculAR Conformer Ensemble Learning (MARCEL) benchmark to thoroughly evaluate the potential of learning on conformer ensembles and suggest promising research directions. MARCEL includes four datasets covering diverse molecule- and reaction-level properties of chemically diverse molecules including organocatalysts and transition-metal catalysts, extending beyond the scope of common GNN benchmarks that are confined to drug-like molecules. In addition, we conduct a comprehensive empirical study, which benchmarks representative 1D, 2D, and 3D molecular representation learning models, along with two strategies that explicitly incorporate conformer ensembles into 3D MRL models. Our findings reveal that direct learning from an accessible conformer space can improve performance on a variety of tasks and models.

* 19 pages

Via

Access Paper or Ask Questions

Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

Aug 16, 2023

Yinglong Wang, Zhen Liu, Jianzhuang Liu, Songcen Xu, Shuaicheng Liu

Figure 1 for Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

Figure 2 for Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

Figure 3 for Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

Figure 4 for Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

Abstract:This paper presents a novel network structure with illumination-aware gamma correction and complete image modelling to solve the low-light image enhancement problem. Low-light environments usually lead to less informative large-scale dark areas, directly learning deep representations from low-light images is insensitive to recovering normal illumination. We propose to integrate the effectiveness of gamma correction with the strong modelling capacities of deep networks, which enables the correction factor gamma to be learned in a coarse to elaborate manner via adaptively perceiving the deviated illumination. Because exponential operation introduces high computational complexity, we propose to use Taylor Series to approximate gamma correction, accelerating the training and inference speed. Dark areas usually occupy large scales in low-light images, common local modelling structures, e.g., CNN, SwinIR, are thus insufficient to recover accurate illumination across whole low-light images. We propose a novel Transformer block to completely simulate the dependencies of all pixels across images via a local-to-global hierarchical attention mechanism, so that dark areas could be inferred by borrowing the information from far informative regions in a highly effective manner. Extensive experiments on several benchmark datasets demonstrate that our approach outperforms state-of-the-art methods.

* Accepted by ICCV 2023

Via

Access Paper or Ask Questions

Self-supervised Learning and Graph Classification under Heterophily

Jun 14, 2023

Yilin Ding, Zhen Liu, Hao Hao

Figure 1 for Self-supervised Learning and Graph Classification under Heterophily

Figure 2 for Self-supervised Learning and Graph Classification under Heterophily

Figure 3 for Self-supervised Learning and Graph Classification under Heterophily

Figure 4 for Self-supervised Learning and Graph Classification under Heterophily

Abstract:Self-supervised learning has shown its promising capability in graph representation learning in recent work. Most existing pre-training strategies usually choose the popular Graph neural networks (GNNs), which can be seen as a special form of low-pass filter, fail to effectively capture heterophily. In this paper, we first present an experimental investigation exploring the performance of low-pass and high-pass filters in heterophily graph classification, where the results clearly show that high-frequency signal is important for learning heterophily graph representation. On the other hand, it is still unclear how to effectively capture the structural pattern of graphs and how to measure the capability of the self-supervised pre-training strategy in capturing graph structure. To address the problem, we first design a quantitative metric to Measure Graph Structure (MGS), which analyzes correlation between structural similarity and embedding similarity of graph pairs. Then, to enhance the graph structural information captured by self-supervised learning, we propose a novel self-supervised strategy for Pre-training GNNs based on the Metric (PGM). Extensive experiments validate our pre-training strategy achieves state-of-the-art performance for molecular property prediction and protein function prediction. In addition, we find choosing the suitable filter sometimes may be better than designing good pre-training strategies for heterophily graph classification.

Via

Access Paper or Ask Questions