Zunlei Feng

Zhejiang University

ViT-Calibrator: Decision Stream Calibration for Vision Transformer

May 05, 2023

Lin Chen, Zhijie Jia, Tian Qiu, Lechao Cheng, Jie Lei, Zunlei Feng, Mingli Song

Abstract: A surge of interest has emerged in utilizing Transformers in diverse vision tasks owing to their formidable performance. However, existing approaches primarily focus on optimizing internal model architecture designs, which often entails significant and burdensome trial and error. In this work, we propose a new paradigm dubbed Decision Stream Calibration that boosts the performance of general Vision Transformers. To achieve this, we shed light on the information propagation mechanism in the learning procedure by exploring the correlation between different tokens and the relevance coefficient of multiple dimensions. Further analysis reveals that 1) the final decision is associated with tokens of foreground targets: token features of foreground targets are transmitted into the next layer as much as possible, while the useless token features of background areas are eliminated gradually during forward propagation; and 2) each category is solely associated with specific sparse dimensions in the tokens. Based on these discoveries, we design a two-stage calibration scheme, namely ViT-Calibrator, consisting of a token propagation calibration stage and a dimension propagation calibration stage. Extensive experiments on commonly used datasets show that the proposed approach can achieve promising results. The source code is given in the supplements.

* The paper currently involves internal company projects and cannot be published for the time being, so it needs to be temporarily withdrawn

Transition Propagation Graph Neural Networks for Temporal Networks

Apr 15, 2023

Tongya Zheng, Zunlei Feng, Tianli Zhang, Yunzhi Hao, Mingli Song, Xingen Wang, Xinyu Wang, Ji Zhao, Chun Chen

Abstract: Researchers of temporal networks (e.g., social networks and transaction networks) have been interested in mining dynamic patterns of nodes from their diverse interactions. Inspired by recent powerful graph mining methods like skip-gram models and Graph Neural Networks (GNNs), existing approaches focus on generating temporal node embeddings sequentially from nodes' sequential interactions. However, the sequential modeling of previous approaches cannot handle the transition structure between nodes' neighbors because of its limited memorization capacity. Specifically, an effective method for transition structures is required both to model nodes' personalized patterns adaptively and to capture node dynamics accordingly. In this paper, we propose a method, namely Transition Propagation Graph Neural Networks (TIP-GNN), to tackle the challenges of encoding nodes' transition structures. The proposed TIP-GNN focuses on the bilevel graph structure in temporal networks: besides the explicit interaction graph, a node's sequential interactions can also be constructed as a transition graph. Based on the bilevel graph, TIP-GNN further encodes transition structures by multi-step transition propagation and distills information from neighborhoods by a bilevel graph convolution. Experimental results over various temporal networks reveal the efficiency of our TIP-GNN, with up to 7.2% improvement in accuracy on temporal link prediction. Extensive ablation studies further verify the effectiveness and limitations of the transition propagation module. Our code is available at https://github.com/doujiang-zheng/TIP-GNN.

* Published in IEEE Transactions on Neural Networks and Learning Systems, 2022
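
The TIP-GNN abstract says that a node's sequential interactions can be constructed as a transition graph. The sketch below shows one plausible reading of that idea and is not taken from the authors' code: it assumes (hypothetically) that each pair of consecutive neighbors in a node's time-ordered interaction list forms a directed transition edge, weighted by how often it occurs. The function name `transition_graph` and the `(timestamp, neighbor)` input layout are illustrative assumptions.

```python
from collections import Counter


def transition_graph(interactions):
    """Build a weighted transition graph for a single node.

    `interactions` is a list of (timestamp, neighbor) pairs (assumed layout).
    Each pair of consecutive neighbors in time order becomes a directed
    edge; the returned Counter maps (src, dst) -> number of transitions.
    """
    # Order the neighbors by interaction time.
    ordered = [neighbor for _, neighbor in sorted(interactions, key=lambda p: p[0])]
    # Count transitions between consecutive neighbors.
    return Counter(zip(ordered, ordered[1:]))


if __name__ == "__main__":
    history = [(3, "c"), (1, "a"), (2, "b"), (4, "a"), (5, "b")]
    for edge, weight in transition_graph(history).items():
        print(edge, weight)
```

The multi-step transition propagation and the bilevel graph convolution described in the abstract are not modeled here; the authors' repository is the reference for the actual construction.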

Temporal Aggregation and Propagation Graph Neural Networks for Dynamic Representation

Apr 15, 2023

Tongya Zheng, Xinchao Wang, Zunlei Feng, Jie Song, Yunzhi Hao, Mingli Song, Xingen Wang, Xinyu Wang, Chun Chen

Abstract: Temporal graphs exhibit dynamic interactions between nodes over continuous time, and their topologies evolve over time. The whole temporal neighborhood of a node reveals its varying preferences. However, previous works usually generate dynamic representations from a limited set of neighbors for simplicity, which results in both inferior performance and high latency of online inference. Therefore, in this paper, we propose a novel method of temporal graph convolution over the whole neighborhood, namely Temporal Aggregation and Propagation Graph Neural Networks (TAP-GNN). Specifically, we first analyze the computational complexity of the dynamic representation problem by unfolding the temporal graph in a message-passing paradigm. The high complexity motivates us to design the AP (aggregation and propagation) block, which significantly reduces the repeated computation of historical neighbors. The final TAP-GNN supports online inference in the graph stream scenario, and incorporates the temporal information into node embeddings with a temporal activation function and a projection layer in addition to several AP blocks. Experimental results on various real-life temporal networks show that our proposed TAP-GNN outperforms existing temporal graph methods by a large margin in terms of both predictive performance and online inference latency. Our code is available at https://github.com/doujiang-zheng/TAP-GNN.

* Published in IEEE Transactions on Knowledge and Data Engineering, April 2023. We have updated the experimental results of the baselines

Life Regression based Patch Slimming for Vision Transformers

Apr 11, 2023

Jiawei Chen, Lin Chen, Jiang Yang, Tianqi Shi, Lechao Cheng, Zunlei Feng, Mingli Song

Abstract: Vision transformers have achieved remarkable success in computer vision tasks by using multi-head self-attention modules to capture long-range dependencies within images. However, the high inference computation cost poses a new challenge. Several methods have been proposed to address this problem, mainly by slimming patches. In the inference stage, these methods classify patches into two classes, one to keep and the other to discard in multiple layers. This approach results in additional computation at every layer where patches are discarded, which hinders inference acceleration. In this study, we tackle the patch slimming problem from a different perspective by proposing a life regression module that determines the lifespan of each image patch in one go. During inference, the patch is discarded once the current layer index exceeds its life. Our proposed method avoids additional computation and parameters in multiple layers to enhance inference speed while maintaining competitive performance. Additionally, our approach requires fewer training epochs than other patch slimming methods.

* 8 pages, 4 figures
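
The discard rule stated in the abstract (a patch is dropped once the current layer index exceeds its life) can be illustrated with a minimal sketch. The lifespans are taken as given here; how the life regression module predicts them is not described in the abstract and is not modeled. The function names and the 0-based layer indexing are assumptions made for illustration.

```python
def active_patches(lifespans, layer_index):
    """Return indices of patches that are still kept at `layer_index`.

    A patch is discarded once the layer index exceeds its life, so it is
    kept while layer_index <= life. Layers are 0-indexed (an assumption).
    """
    return [i for i, life in enumerate(lifespans) if layer_index <= life]


def patches_per_layer(lifespans, num_layers):
    """List the surviving patch indices for every layer of the model."""
    return [active_patches(lifespans, layer) for layer in range(num_layers)]


if __name__ == "__main__":
    # Hypothetical lifespans for four patches, predicted once up front.
    lifespans = [0, 2, 1, 3]
    for layer, kept in enumerate(patches_per_layer(lifespans, num_layers=4)):
        print(f"layer {layer}: keep patches {kept}")
```

Because each lifespan is fixed before the first layer runs, the per-layer step reduces to a comparison against the layer index, which matches the abstract's point that no extra keep/discard classifier is evaluated at multiple layers.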

Propheter: Prophetic Teacher Guided Long-Tailed Distribution Learning

Apr 09, 2023

Wenxiang Xu, Linyun Zhou, Lin Chen, Lechao Cheng, Jie Lei, Zunlei Feng, Mingli Song

Abstract: The problem of deep long-tailed learning, a prevalent challenge in the realm of generic visual recognition, persists in a multitude of real-world applications. To tackle the heavily skewed dataset issue in long-tailed classification, prior efforts have sought to augment existing deep models with elaborate class-balancing strategies, such as class rebalancing, data augmentation, and module improvement. Despite the encouraging performance, the limited class knowledge of the tail classes in the training dataset still bottlenecks the performance of existing deep models. In this paper, we propose an innovative long-tailed learning paradigm that breaks the bottleneck by guiding the learning of deep networks with external prior knowledge. This is specifically achieved by devising an elaborate "prophetic" teacher, termed "Propheter", that aims to learn the potential class distributions. The target long-tailed prediction model is then optimized under the instruction of the well-trained "Propheter", such that the distributions of different classes are as distinguishable as possible from each other. Experiments on eight long-tailed benchmarks across three architectures demonstrate that the proposed prophetic paradigm acts as a promising solution to the challenge of limited class knowledge in long-tailed datasets. Our code and model can be found in the supplementary material.

Team DETR: Guide Queries as a Professional Team in Detection Transformers

Feb 28, 2023

Tian Qiu, Linyun Zhou, Wenxiang Xu, Lechao Cheng, Zunlei Feng, Mingli Song

Abstract: Recently proposed DETR variants have made tremendous progress in various scenarios due to their streamlined processes and remarkable performance. However, the learned queries usually explore the global context to generate the final set prediction, resulting in redundant burdens and unfaithful results. More specifically, a query is commonly responsible for objects of different scales and positions, which is a challenge for the query itself and causes spatial resource competition among queries. To alleviate this issue, we propose Team DETR, which leverages query collaboration and position constraints to embrace objects of interest more precisely. We also dynamically cater to each query member's prediction preference, offering the query better scale and spatial priors. In addition, the proposed Team DETR is flexible enough to be adapted to other existing DETR variants without increasing parameters and calculations. Extensive experiments on the COCO dataset show that Team DETR achieves remarkable gains, especially for small and large objects. Code is available at https://github.com/horrible-dong/TeamDETR.

Model Doctor for Diagnosing and Treating Segmentation Error

Feb 23, 2023

Zhijie Jia, Lin Chen, Kaiwen Hu, Lechao Cheng, Zunlei Feng, Mingli Song

Abstract: Despite the remarkable progress in semantic segmentation tasks with the advancement of deep neural networks, typical U-shaped hierarchical segmentation networks still suffer from local misclassification of categories and inaccurate target boundaries. In an effort to alleviate this issue, we propose a Model Doctor for semantic segmentation problems. The Model Doctor is designed to diagnose the aforementioned problems in existing pre-trained models and treat them without introducing additional data, with the goal of refining the parameters to achieve better performance. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our method. Code is available at https://github.com/zhijiejia/SegDoctor.

Recent advances in artificial intelligence for retrosynthesis

Jan 14, 2023

Zipeng Zhong, Jie Song, Zunlei Feng, Tiantao Liu, Lingxiang Jia, Shaolun Yao, Tingjun Hou, Mingli Song

Abstract: Retrosynthesis is the cornerstone of organic chemistry, providing chemists in material and drug manufacturing access to poorly available and brand-new molecules. Conventional rule-based or expert-based computer-aided synthesis has obvious limitations, such as high labor costs and limited search space. In recent years, dramatic breakthroughs driven by artificial intelligence have revolutionized retrosynthesis. Here we aim to present a comprehensive review of recent advances in AI-based retrosynthesis. For both single-step and multi-step retrosynthesis, we first state their goals and provide a thorough taxonomy of existing methods. Afterwards, we analyze these methods in terms of their mechanism and performance and introduce popular evaluation metrics for them, along with a detailed comparison among representative methods on several public datasets. In the next part, we introduce popular databases and established platforms for retrosynthesis. Finally, this review concludes with a discussion of promising research directions in this field.

* 27 pages, 6 figures, 4 tables

Conservative-Progressive Collaborative Learning for Semi-supervised Semantic Segmentation

Nov 30, 2022

Siqi Fan, Fenghua Zhu, Zunlei Feng, Yisheng Lv, Mingli Song, Fei-Yue Wang

Abstract: Pseudo supervision is regarded as the core idea in semi-supervised learning for semantic segmentation, and there is always a tradeoff between utilizing only the high-quality pseudo labels and leveraging all the pseudo labels. To address this, we propose a novel learning approach, called Conservative-Progressive Collaborative Learning (CPCL), in which two predictive networks are trained in parallel and the pseudo supervision is implemented based on both the agreement and disagreement of the two predictions. One network seeks common ground via intersection supervision and is supervised by the high-quality labels to ensure more reliable supervision, while the other network preserves differences via union supervision and is supervised by all the pseudo labels to keep exploring with curiosity. Thus, the collaboration of conservative evolution and progressive exploration can be achieved. To reduce the influence of suspicious pseudo labels, the loss is dynamically re-weighted according to the prediction confidence. Extensive experiments demonstrate that CPCL achieves state-of-the-art performance for semi-supervised semantic segmentation.
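
The intersection and union supervision described in the CPCL abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: it assumes per-pixel hard labels and confidences from the two networks, marks disagreeing pixels as ignored for the intersection labels, and (as a hypothetical choice the abstract does not specify) resolves disagreements in the union labels by taking the more confident prediction, whose confidence is returned as a per-pixel loss weight. `IGNORE` and both function names are made up for this example.

```python
IGNORE = -1  # hypothetical marker for pixels excluded from the loss


def intersection_labels(pred_a, pred_b):
    """Keep a pseudo label only where the two networks agree."""
    return [a if a == b else IGNORE for a, b in zip(pred_a, pred_b)]


def union_labels(pred_a, conf_a, pred_b, conf_b):
    """Keep a pseudo label for every pixel.

    Where the networks disagree, the more confident prediction is used
    (an assumption). Returns (labels, weights), where each weight is the
    confidence of the chosen label and can re-weight the per-pixel loss.
    """
    labels, weights = [], []
    for a, ca, b, cb in zip(pred_a, conf_a, pred_b, conf_b):
        if ca >= cb:
            labels.append(a)
            weights.append(ca)
        else:
            labels.append(b)
            weights.append(cb)
    return labels, weights


if __name__ == "__main__":
    pred_a, conf_a = [0, 1, 2], [0.9, 0.6, 0.3]
    pred_b, conf_b = [0, 2, 2], [0.8, 0.7, 0.5]
    print(intersection_labels(pred_a, pred_b))
    print(union_labels(pred_a, conf_a, pred_b, conf_b))
```

In this reading, the conservative network would train on the intersection labels and the progressive network on the union labels, with low-confidence pixels contributing less to the loss.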

Transferability Estimation Based On Principal Gradient Expectation

Nov 30, 2022

Huiyan Qi, Lechao Cheng, Jingjing Chen, Yue Yu, Zunlei Feng, Yu-Gang Jiang

Abstract: Deep transfer learning has been widely used for knowledge transmission in recent years. The standard approach of pre-training and subsequently fine-tuning, or linear probing, has shown itself to be effective in many down-stream tasks. Therefore, a challenging and ongoing question arises: how to quantify cross-task transferability that is compatible with transferred results while keeping self-consistency? Existing transferability metrics are estimated on the particular model by conversing source and target tasks. They must be recalculated with all existing source tasks whenever a novel unknown target task is encountered, which is extremely computationally expensive. In this work, we highlight what properties should be satisfied and evaluate existing metrics in light of these characteristics. Building upon this, we propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks. Specifically, we use a restart scheme to calculate every batch gradient over each weight unit more than once, and then we take the average of all the gradients to get the expectation. Thus, the transferability between the source and target task is estimated by computing the distance of normalized principal gradients. Extensive experiments show that the proposed transferability metric is more stable, reliable and efficient than SOTA methods.

* 13 pages, 3 figures, 9 tables
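
The last two steps of the PGE abstract (average the collected gradients, then compare normalized gradients by a distance) can be illustrated with a small sketch. The abstract does not name the norm or the distance, so L2 normalization and Euclidean distance are assumptions here, and the restart scheme that produces the gradient samples is not modeled: the gradients are taken as given flat lists of floats. All function names are illustrative.

```python
import math


def mean_gradient(gradients):
    """Average a list of equally sized gradient vectors element-wise."""
    count = len(gradients)
    return [sum(column) / count for column in zip(*gradients)]


def l2_normalize(vector):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def gradient_distance(source_gradients, target_gradients):
    """Euclidean distance between the normalized mean gradients of two tasks."""
    source = l2_normalize(mean_gradient(source_gradients))
    target = l2_normalize(mean_gradient(target_gradients))
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(source, target)))


if __name__ == "__main__":
    # Hypothetical gradient samples for a two-parameter model.
    source = [[1.0, 0.0], [3.0, 0.0]]
    target = [[0.0, 2.0], [0.0, 4.0]]
    print(gradient_distance(source, target))
```

Under these assumptions a smaller distance would suggest that the two tasks push the weights in similar directions; how the paper maps the distance to a transferability score is left to the paper itself.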
