
Cheng Cheng


Meta-Adapter: An Online Few-shot Learner for Vision-Language Model

Nov 07, 2023
Cheng Cheng, Lin Song, Ruoyi Xue, Hang Wang, Hongbin Sun, Yixiao Ge, Ying Shan

The contrastive vision-language pre-training, known as CLIP, demonstrates remarkable potential in perceiving open-world visual concepts, enabling effective zero-shot image recognition. Nevertheless, few-shot learning methods based on CLIP typically require offline fine-tuning of the parameters on few-shot samples, resulting in longer inference time and the risk of over-fitting in certain domains. To tackle these challenges, we propose the Meta-Adapter, a lightweight residual-style adapter, to refine the CLIP features guided by the few-shot samples in an online manner. With a few training samples, our method can enable effective few-shot learning capabilities and generalize to unseen data or tasks without additional fine-tuning, achieving competitive performance and high efficiency. Without bells and whistles, our approach outperforms the state-of-the-art online few-shot learning method by an average of 3.6% on eight image classification datasets with higher inference speed. Furthermore, our model is simple and flexible, serving as a plug-and-play module directly applicable to downstream tasks. Without further fine-tuning, Meta-Adapter obtains notable performance improvements in open-vocabulary object detection and segmentation tasks.
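The residual-style refinement described above can be sketched in a few lines. The attention form, temperature `tau`, and residual weight `alpha` below are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def meta_adapter(text_feat, support_feats, tau=0.1, alpha=0.5):
    """Refine a CLIP class embedding with few-shot support features.

    text_feat:     (d,) L2-normalized text embedding of one class.
    support_feats: (k, d) L2-normalized few-shot image embeddings.
    Residual style: the refined feature stays close to the original,
    which is what preserves zero-shot generalization.
    """
    attn = softmax(support_feats @ text_feat / tau)   # (k,) support weights
    delta = attn @ support_feats                      # weighted support mixture
    refined = text_feat + alpha * delta               # residual update
    return refined / np.linalg.norm(refined)

# toy usage: 4 support images, 8-dim embeddings
rng = np.random.default_rng(0)
t = rng.normal(size=8); t /= np.linalg.norm(t)
S = rng.normal(size=(4, 8)); S /= np.linalg.norm(S, axis=1, keepdims=True)
r = meta_adapter(t, S)
print(np.linalg.norm(r))  # 1.0: refined feature is unit-normalized
```

Because the update is purely feed-forward at test time, no gradient steps on the few-shot samples are needed — the "online" property the abstract emphasizes.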

* Accepted by NeurIPS 2023 

Skywork: A More Open Bilingual Foundation Model

Oct 30, 2023
Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu, Chenxia Li, Liu Yang, Xilin Luo, Xuejie Wu, Lunan Liu, Wenjun Cheng, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Lei Lin, Xiaokun Wang, Yutuan Ma, Chuanhai Dong, Yanqi Sun, Yifu Chen, Yongyi Peng, Xiaojuan Liang, Shuicheng Yan, Han Fang, Yahui Zhou

In this technical report, we present Skywork-13B, a family of large language models (LLMs) trained on a corpus of over 3.2 trillion tokens drawn from both English and Chinese texts. This bilingual foundation model is the most extensively trained and openly published LLM of comparable size to date. We introduce a two-stage training methodology using a segmented corpus, targeting general-purpose training and then domain-specific enhancement training, respectively. We show that our model not only excels on popular benchmarks, but also achieves state-of-the-art performance in Chinese language modeling on diverse domains. Furthermore, we propose a novel leakage detection method, demonstrating that test data contamination is a pressing issue warranting further investigation by the LLM community. To spur future research, we release Skywork-13B along with checkpoints obtained during intermediate stages of the training process. We are also releasing part of our SkyPile corpus, a collection of over 150 billion tokens of web text, the largest high-quality open Chinese pre-training corpus to date. We hope Skywork-13B and our open corpus will serve as a valuable open-source resource to democratize access to high-quality LLMs.


Random Sampling of Bandlimited Graph Signals from Local Measurements

Oct 18, 2023
Lili Shen, Jun Xian, Cheng Cheng

Random sampling of graph signals is one of the fundamental topics in graph signal processing. In this letter, we consider the random sampling of k-bandlimited signals from local measurements and show that no more than O(k log k) measurements with replacement are sufficient for the accurate and stable recovery of any k-bandlimited graph signal. We propose two random sampling strategies based on this minimum number of measurements, i.e., optimal sampling and estimated sampling. The geodesic distance between vertices is introduced to design the sampling probability distribution. Numerical experiments are included to show the effectiveness of the proposed methods.
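A minimal sketch of the recovery step: sample vertices with replacement from a probability distribution, then solve an importance-reweighted least-squares problem restricted to the first k Laplacian eigenvectors. A uniform distribution stands in for the paper's geodesic-distance-based design, and point samples stand in for its local measurements; both are simplifying assumptions.

```python
import numpy as np

def recover_bandlimited(L, x_samples, idx, p, k):
    """Recover a k-bandlimited graph signal from random point samples.

    L:         (n, n) graph Laplacian.
    x_samples: observed signal values at the sampled vertices.
    idx:       sampled vertex indices (drawn with replacement).
    p:         (n,) sampling probability distribution used to draw idx.
    k:         bandwidth (signal lies in the span of the first k eigenvectors).
    """
    _, U = np.linalg.eigh(L)
    Uk = U[:, :k]                          # first k frequency modes
    w = 1.0 / np.sqrt(p[idx])              # importance reweighting
    M = w[:, None] * Uk[idx]               # weighted measurement matrix
    coef, *_ = np.linalg.lstsq(M, w * x_samples, rcond=None)
    return Uk @ coef

# toy example: path graph on 30 vertices, bandwidth k = 3
n, k = 30, 3
Adj = np.diag(np.ones(n - 1), 1); Adj = Adj + Adj.T
L = np.diag(Adj.sum(1)) - Adj
_, U = np.linalg.eigh(L)
x = U[:, :k] @ np.array([1.0, -0.5, 0.3])  # exactly 3-bandlimited signal
rng = np.random.default_rng(1)
p = np.full(n, 1.0 / n)                     # uniform sampling (sketch only)
idx = rng.choice(n, size=20, replace=True, p=p)  # ~ O(k log k) with slack
x_hat = recover_bandlimited(L, x[idx], idx, p, k)
print(np.max(np.abs(x_hat - x)))            # ~ 0 in the noiseless case
```

In the noiseless setting, recovery is exact as soon as the sampled rows of Uk have rank k, which is why O(k log k) draws with replacement suffice with high probability.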


Graph Propagation Transformer for Graph Representation Learning

May 19, 2023
Zhe Chen, Hao Tan, Tao Wang, Tianrun Shen, Tong Lu, Qiuying Peng, Cheng Cheng, Yue Qi


This paper presents a novel transformer architecture for graph representation learning. The core insight of our method is to fully consider the information propagation among nodes and edges in a graph when building the attention module in the transformer blocks. Specifically, we propose a new attention mechanism called Graph Propagation Attention (GPA). It explicitly passes information among nodes and edges in three ways, i.e., node-to-node, node-to-edge, and edge-to-node, which is essential for learning graph-structured data. On this basis, we design an effective transformer architecture named Graph Propagation Transformer (GPTrans) to further help learn graph data. We verify the performance of GPTrans in a wide range of graph learning experiments on several benchmark datasets. The results show that our method outperforms many state-of-the-art transformer-based graph models. The code will be released at https://github.com/czczup/GPTrans.
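A toy sketch of the three-way propagation idea on a small dense graph. The score function, residual updates, and shapes below are illustrative assumptions, not GPTrans's actual parameterization:

```python
import numpy as np

def masked_softmax(scores, mask):
    """Row-wise softmax restricted to entries where mask is True."""
    scores = np.where(mask, scores, -1e30)
    scores = scores - scores.max(axis=-1, keepdims=True)
    e = np.exp(scores) * mask
    return e / e.sum(axis=-1, keepdims=True)

def gpa_layer(X, E, adj):
    """One propagation step: node-to-edge, then edge-to-node and node-to-node.

    X: (n, d) node features; E: (n, n, d) edge features; adj: (n, n) 0/1 mask.
    """
    n, d = X.shape
    # node-to-edge: each edge absorbs its two endpoints' features
    E = E + adj[..., None] * (X[:, None, :] + X[None, :, :])
    # attention scores combine node affinity with an edge-content bias
    scores = (X @ X.T) / np.sqrt(d) + E.sum(axis=-1)
    attn = masked_softmax(scores, adj.astype(bool))
    # node-to-node plus edge-to-node aggregation, with a residual connection
    X = X + attn @ X + np.einsum("ij,ijd->id", attn, E)
    return X, E

# toy usage: triangle graph, 4-dim features
n, d = 3, 4
adj = np.ones((n, n)) - np.eye(n)
rng = np.random.default_rng(0)
X, E = rng.normal(size=(n, d)), rng.normal(size=(n, n, d))
X2, E2 = gpa_layer(X, E, adj)
print(X2.shape, E2.shape)  # (3, 4) (3, 3, 4)
```

The point of the sketch is the data flow: edges are first-class carriers of information rather than mere attention biases, which is the property the abstract highlights.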

* Accepted to IJCAI 2023 

Graph Fourier transforms on directed product graphs

Sep 07, 2022
Cheng Cheng, Yang Chen, Jeon Yu Lee, Qiyu Sun


Graph Fourier transform (GFT) is one of the fundamental tools in graph signal processing: it decomposes graph signals into different frequency components and effectively represents strongly correlated graph signals by different modes of variation. The GFT on undirected graphs has been well studied, and several approaches have been proposed to define GFTs on directed graphs. In this paper, based on the singular value decompositions of some graph Laplacians, we propose two GFTs on the Cartesian product graph of two directed graphs. We show that the proposed GFTs can efficiently represent strongly correlated spatial-temporal data sets on directed networks, and that in the undirected graph setting they are essentially the joint GFT in the literature. We also consider the bandlimiting procedure in the spectral domain of the proposed GFTs, and demonstrate its performance on denoising a temperature data set from the region of Brest (France) in January 2014.
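In the undirected special case the abstract mentions, the joint GFT can be checked directly: the Laplacian of a Cartesian product graph is the Kronecker sum of the factor Laplacians, and the Kronecker product of the factor eigenbases diagonalizes it. A small numerical check:

```python
import numpy as np

def path_laplacian(n):
    """Combinatorial Laplacian of an undirected path graph on n vertices."""
    A = np.diag(np.ones(n - 1), 1); A = A + A.T
    return np.diag(A.sum(1)) - A

Lg, Lh = path_laplacian(4), path_laplacian(3)
# Laplacian of the Cartesian product graph: the Kronecker sum
L = np.kron(Lg, np.eye(3)) + np.kron(np.eye(4), Lh)
# joint GFT basis: Kronecker product of the factor eigenbases
_, Ug = np.linalg.eigh(Lg)
_, Uh = np.linalg.eigh(Lh)
U = np.kron(Ug, Uh)
# U diagonalizes L; the diagonal holds sums of factor eigenvalues
D = U.T @ L @ U
print(np.max(np.abs(D - np.diag(np.diag(D)))))  # ~ 0: off-diagonal vanishes
```

The directed case is where this breaks down — the Laplacian is no longer symmetric, hence the paper's move to singular value decompositions.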


Learning Clinical Concepts for Predicting Risk of Progression to Severe COVID-19

Aug 28, 2022
Helen Zhou, Cheng Cheng, Kelly J. Shields, Gursimran Kochhar, Tariq Cheema, Zachary C. Lipton, Jeremy C. Weiss


With COVID-19 now pervasive, identification of high-risk individuals is crucial. Using data from a major healthcare provider in Southwestern Pennsylvania, we develop survival models predicting severe COVID-19 progression. In this endeavor, we face a tradeoff between more accurate models relying on many features and less accurate models relying on a few features aligned with clinician intuition. Complicating matters, many EHR features tend to be under-coded, degrading the accuracy of smaller models. In this study, we develop two sets of high-performance risk scores: (i) an unconstrained model built from all available features; and (ii) a pipeline that learns a small set of clinical concepts before training a risk predictor. Learned concepts boost performance over the corresponding features (C-index 0.858 vs. 0.844) and demonstrate improvements over (i) when evaluated out-of-sample (subsequent time periods). Our models outperform previous works (C-index 0.844-0.872 vs. 0.598-0.810).


Graph Fourier transform based on singular value decomposition of directed Laplacian

May 12, 2022
Yang Chen, Cheng Cheng, Qiyu Sun


Graph Fourier transform (GFT) is a fundamental concept in graph signal processing. In this paper, based on the singular value decomposition of the Laplacian, we introduce a novel definition of the GFT on directed graphs, and use the singular values of the Laplacian to carry the notion of graph frequencies. The proposed GFT is consistent with the conventional GFT in the undirected graph setting, and on directed circulant graphs it coincides with the classical discrete Fourier transform, up to some rotation, permutation and phase adjustment. We show that frequencies and frequency components of the proposed GFT can be evaluated by solving some constrained minimization problems with low computational cost. Numerical demonstrations indicate that the proposed GFT can represent graph signals with different modes of variation efficiently.
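A minimal illustration of the SVD-based construction: take the out-degree Laplacian of a directed cycle, read off the singular values as frequencies, and use an orthogonal singular-vector basis as the transform. The paper's actual definition combines both singular-vector sets; using the left singular vectors alone is a simplification for the sketch.

```python
import numpy as np

# directed cycle on 5 vertices: out-degree Laplacian L = D_out - A
n = 5
A = np.zeros((n, n))
A[np.arange(n), (np.arange(n) + 1) % n] = 1.0
L = np.diag(A.sum(1)) - A          # non-symmetric: eigh does not apply

# SVD of the Laplacian: singular values carry the notion of frequency
U, s, Vt = np.linalg.svd(L)

x = np.random.default_rng(2).normal(size=n)
x_hat = U.T @ x                    # forward transform (simplified choice)
x_rec = U @ x_hat                  # inverse transform: U is orthogonal
print(np.max(np.abs(x_rec - x)))   # ~ 0: perfect reconstruction
```

Unlike eigendecomposition of a non-symmetric Laplacian, the SVD always exists and yields real, nonnegative "frequencies" and orthogonal bases, which is the structural advantage the paper builds on.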


Wiener filters on graphs and distributed polynomial approximation algorithms

May 09, 2022
Cong Zheng, Cheng Cheng, Qiyu Sun


In this paper, we consider Wiener filters to reconstruct deterministic and (wide-band) stationary graph signals from observations corrupted by random noise, and we propose distributed algorithms to implement Wiener filters and inverse filters on networks whose agents are equipped with a data processing subsystem for limited data storage and computation power, and with a one-hop communication subsystem for direct data exchange only with their adjacent agents. The proposed distributed polynomial approximation algorithm is an exponentially convergent quasi-Newton method based on Jacobi polynomial approximation and Chebyshev interpolation polynomial approximation to analytic functions on a cube. Our numerical simulations show that the Wiener filtering procedure denoises (wide-band) stationary signals better than the Tikhonov regularization approach does, and that the proposed polynomial approximation algorithms converge faster than the Chebyshev polynomial approximation algorithm and the gradient descent algorithm do when implementing an inverse filtering procedure associated with a polynomial filter of commutative graph shifts.
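The Chebyshev baseline mentioned at the end can be made concrete: approximating a spectral response h on [0, lmax] by a Chebyshev series lets h(L)x be applied through repeated products with L alone, and each product L @ t is a one-hop data exchange, which is what makes the scheme distributable. The degree, lmax, and the Tikhonov-style response below are illustrative choices.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_filter(L, x, h, deg, lmax):
    """Apply h(L) x via a degree-`deg` Chebyshev approximation of h on [0, lmax]."""
    poly = C.Chebyshev.interpolate(h, deg, domain=[0, lmax])
    c = poly.coef
    # map the spectrum of L into [-1, 1], the Chebyshev window
    Ls = (2.0 / lmax) * L - np.eye(L.shape[0])
    t_prev, t_cur = x, Ls @ x                  # T_0 x and T_1 x
    y = c[0] * t_prev + c[1] * t_cur
    for ck in c[2:]:                           # three-term recurrence
        t_prev, t_cur = t_cur, 2 * (Ls @ t_cur) - t_prev
        y = y + ck * t_cur
    return y

# toy check against exact spectral filtering on a path graph
n = 12
Adj = np.diag(np.ones(n - 1), 1); Adj = Adj + Adj.T
L = np.diag(Adj.sum(1)) - Adj
x = np.random.default_rng(3).normal(size=n)
h = lambda t: 1.0 / (1.0 + t)                  # Tikhonov-style response
w, U = np.linalg.eigh(L)
exact = U @ (h(w) * (U.T @ x))
approx = cheb_filter(L, x, h, deg=20, lmax=4.0)
print(np.max(np.abs(approx - exact)))          # small approximation error
```

The paper's contribution is a faster-converging Jacobi-polynomial quasi-Newton scheme; this sketch only shows the standard Chebyshev approach it is compared against.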


MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks

Mar 14, 2022
Yun He, Xue Feng, Cheng Cheng, Geng Ji, Yunsong Guo, James Caverlee


In many personalized recommendation scenarios, the generalization ability of a target task can be improved via learning with additional auxiliary tasks alongside this target task on a multi-task network. However, this method often suffers from a serious optimization imbalance problem. On the one hand, one or more auxiliary tasks might have a larger influence than the target task and even dominate the network weights, resulting in worse recommendation accuracy for the target task. On the other hand, the influence of one or more auxiliary tasks might be too weak to assist the target task. More challenging is that this imbalance dynamically changes throughout the training process and varies across the parts of the same network. We propose a new method, MetaBalance, which balances auxiliary losses by directly manipulating their gradients w.r.t. the shared parameters of the multi-task network. Specifically, in each training iteration, and adaptively for each part of the network, the gradient of an auxiliary loss is carefully reduced or enlarged so that its magnitude is closer to that of the target loss gradient, preventing auxiliary tasks from becoming so strong that they dominate the target task or so weak that they cannot help it. Moreover, the proximity between the gradient magnitudes can be flexibly adjusted to adapt MetaBalance to different scenarios. The experiments show that our proposed method achieves a significant improvement of 8.34% in NDCG@10 over the strongest baseline on two real-world datasets. The code of our approach can be found at: https://github.com/facebookresearch/MetaBalance
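The core gradient manipulation can be sketched per step as follows. The real method tracks moving averages of gradient magnitudes for each part of the network, so this single-step version with a relax factor `r` is a simplification:

```python
import numpy as np

def metabalance(g_target, g_aux, r=0.7):
    """Rescale an auxiliary-task gradient toward the target gradient's magnitude.

    r in [0, 1] controls how closely the magnitudes are matched:
    r = 1 forces equal norms; r = 0 leaves the auxiliary gradient untouched.
    Only the magnitude changes -- the auxiliary direction is preserved.
    """
    nt = np.linalg.norm(g_target)
    na = np.linalg.norm(g_aux)
    scale = nt / (na + 1e-12)              # ratio of gradient magnitudes
    return g_aux * (r * scale + (1 - r))   # partial move toward the target norm

g_t = np.array([3.0, 4.0])      # target gradient, norm 5
g_a = np.array([0.0, 50.0])     # dominating auxiliary gradient, norm 50
g_b = metabalance(g_t, g_a, r=1.0)
print(np.linalg.norm(g_b))      # 5.0: auxiliary magnitude matched to the target
```

The same formula shrinks a dominating auxiliary gradient and enlarges a vanishing one, covering both failure modes the abstract describes.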

* Accepted by the WebConf 2022 

Learning to Adapt to Light

Feb 16, 2022
Kai-Fu Yang, Cheng Cheng, Shi-Xuan Zhao, Xian-Shi Zhang, Yong-Jie Li


Light adaptation or brightness correction is a key step in improving the contrast and visual appeal of an image. There are multiple light-related tasks (for example, low-light enhancement and exposure correction), and previous studies have mainly investigated these tasks individually. However, it is interesting to consider whether these light-related tasks can be executed by a unified model, especially considering that our visual system adapts to external light in such a way. In this study, we propose a biologically inspired method to handle light-related image-enhancement tasks with a unified network (called LA-Net). First, a frequency-based decomposition module is designed to decouple the common and characteristic sub-problems of light-related tasks into two pathways. Then, a new module is built inspired by biological visual adaptation to achieve unified light adaptation in the low-frequency pathway. In addition, noise suppression or detail enhancement is achieved effectively in the high-frequency pathway regardless of the light levels. Extensive experiments on three tasks -- low-light enhancement, exposure correction, and tone mapping -- demonstrate that the proposed method achieves near state-of-the-art performance compared with recent methods designed for these individual tasks.
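The two-pathway split can be illustrated with a fixed Gaussian decomposition. LA-Net's decomposition module is learned; the separable Gaussian low-pass below is a stand-in that shows the structural idea: the low-frequency pathway carries illumination, the high-frequency residual carries detail and noise, and the two sum back to the input exactly.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2*radius + 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def decompose(img, sigma=2.0):
    """Split an image into low- and high-frequency pathways.

    Low pass via a separable Gaussian blur with reflect padding; the
    residual is the high-frequency component, so low + high == img.
    """
    k = gaussian_kernel1d(sigma, radius=int(3 * sigma))
    pad = len(k) // 2
    low = img.astype(float)
    for axis in (0, 1):  # separable blur: one axis at a time
        low = np.apply_along_axis(
            lambda v: np.convolve(np.pad(v, pad, mode="reflect"), k, mode="valid"),
            axis, low)
    return low, img - low

img = np.random.default_rng(4).uniform(size=(16, 16))
low, high = decompose(img)
print(np.max(np.abs((low + high) - img)))  # ~ 0: exact up to rounding
```

Light adaptation then only needs to act on `low`, while denoising or sharpening acts on `high`, mirroring the two pathways in the abstract.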

* 10 pages, 9 figures 