Alert button
Picture for Wei Fang

Wei Fang

Alert button

Analyzing and controlling diversity in quantum-behaved particle swarm optimization

Aug 09, 2023
Li-Wei Li, Jun Sun, Chao Li, Wei Fang, Vasile Palade, Xiao-Jun Wu

Figure 1 for Analyzing and controlling diversity in quantum-behaved particle swarm optimization
Figure 2 for Analyzing and controlling diversity in quantum-behaved particle swarm optimization
Figure 3 for Analyzing and controlling diversity in quantum-behaved particle swarm optimization
Figure 4 for Analyzing and controlling diversity in quantum-behaved particle swarm optimization

This paper addresses the issues of controlling and analyzing the population diversity in quantum-behaved particle swarm optimization (QPSO), which is an optimization approach motivated by concepts in quantum mechanics and PSO. In order to gain an in-depth understanding of the role the diversity plays in the evolving process, we first define the genotype diversity by the distance to the average point of the particles' positions and the phenotype diversity by the fitness values for the QPSO. Then, the correlations between the two types of diversities and the search performance are tested and analyzed on several benchmark functions, and the distance-to-average-point diversity is showed to have stronger association with the search performance during the evolving processes. Finally, in the light of the performed diversity analyses, two strategies for controlling the distance-to-average-point diversities are proposed for the purpose of improving the search ability of the QPSO algorithm. Empirical studies on the QPSO with the introduced diversity control methods are performed on a set of benchmark functions from the CEC 2005 benchmark suite. The performance of the proposed methods are evaluated and compared with the original QPSO and other PSO variants.

Viaarxiv icon

Auto-Spikformer: Spikformer Architecture Search

Jun 01, 2023
Kaiwei Che, Zhaokun Zhou, Zhengyu Ma, Wei Fang, Yanqi Chen, Shuaijie Shen, Li Yuan, Yonghong Tian

Figure 1 for Auto-Spikformer: Spikformer Architecture Search
Figure 2 for Auto-Spikformer: Spikformer Architecture Search
Figure 3 for Auto-Spikformer: Spikformer Architecture Search
Figure 4 for Auto-Spikformer: Spikformer Architecture Search

The integration of self-attention mechanisms into Spiking Neural Networks (SNNs) has garnered considerable interest in the realm of advanced deep learning, primarily due to their biological properties. Recent advancements in SNN architecture, such as Spikformer, have demonstrated promising outcomes by leveraging Spiking Self-Attention (SSA) and Spiking Patch Splitting (SPS) modules. However, we observe that Spikformer may exhibit excessive energy consumption, potentially attributable to redundant channels and blocks. To mitigate this issue, we propose Auto-Spikformer, a one-shot Transformer Architecture Search (TAS) method, which automates the quest for an optimized Spikformer architecture. To facilitate the search process, we propose methods Evolutionary SNN neurons (ESNN), which optimizes the SNN parameters, and apply the previous method of weight entanglement supernet training, which optimizes the Vision Transformer (ViT) parameters. Moreover, we propose an accuracy and energy balanced fitness function $\mathcal{F}_{AEB}$ that jointly considers both energy consumption and accuracy, and aims to find a Pareto optimal combination that balances these two objectives. Our experimental results demonstrate the effectiveness of Auto-Spikformer, which outperforms the state-of-the-art method including CNN or ViT models that are manually or automatically designed while significantly reducing energy consumption.

Viaarxiv icon

Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering

May 26, 2023
Yung-Sung Chuang, Wei Fang, Shang-Wen Li, Wen-tau Yih, James Glass

Figure 1 for Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering
Figure 2 for Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering
Figure 3 for Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering
Figure 4 for Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering

We propose EAR, a query Expansion And Reranking approach for improving passage retrieval, with the application to open-domain question answering. EAR first applies a query expansion model to generate a diverse set of queries, and then uses a query reranker to select the ones that could lead to better retrieval results. Motivated by the observation that the best query expansion often is not picked by greedy decoding, EAR trains its reranker to predict the rank orders of the gold passages when issuing the expanded queries to a given retriever. By connecting better the query expansion model and retriever, EAR significantly enhances a traditional sparse retrieval method, BM25. Empirically, EAR improves top-5/20 accuracy by 3-8 and 5-10 points in in-domain and out-of-domain settings, respectively, when compared to a vanilla query expansion model, GAR, and a dense retrieval model, DPR.

* ACL 2023 long paper (Findings) 
Viaarxiv icon

Temporal Contrastive Learning for Spiking Neural Networks

May 23, 2023
Haonan Qiu, Zeyin Song, Yanqi Chen, Munan Ning, Wei Fang, Tao Sun, Zhengyu Ma, Li Yuan, Yonghong Tian

Figure 1 for Temporal Contrastive Learning for Spiking Neural Networks
Figure 2 for Temporal Contrastive Learning for Spiking Neural Networks
Figure 3 for Temporal Contrastive Learning for Spiking Neural Networks
Figure 4 for Temporal Contrastive Learning for Spiking Neural Networks

Biologically inspired spiking neural networks (SNNs) have garnered considerable attention due to their low-energy consumption and spatio-temporal information processing capabilities. Most existing SNNs training methods first integrate output information across time steps, then adopt the cross-entropy (CE) loss to supervise the prediction of the average representations. However, in this work, we find the method above is not ideal for the SNNs training as it omits the temporal dynamics of SNNs and degrades the performance quickly with the decrease of inference time steps. One tempting method to model temporal correlations is to apply the same label supervision at each time step and treat them identically. Although it can acquire relatively consistent performance across various time steps, it still faces challenges in obtaining SNNs with high performance. Inspired by these observations, we propose Temporal-domain supervised Contrastive Learning (TCL) framework, a novel method to obtain SNNs with low latency and high performance by incorporating contrastive supervision with temporal domain information. Contrastive learning (CL) prompts the network to discern both consistency and variability in the representation space, enabling it to better learn discriminative and generalizable features. We extend this concept to the temporal domain of SNNs, allowing us to flexibly and fully leverage the correlation between representations at different time steps. Furthermore, we propose a Siamese Temporal-domain supervised Contrastive Learning (STCL) framework to enhance the SNNs via augmentation, temporal and class constraints simultaneously. Extensive experimental results demonstrate that SNNs trained by our TCL and STCL can achieve both high performance and low latency, achieving state-of-the-art performance on a variety of datasets (e.g., CIFAR-10, CIFAR-100, and DVS-CIFAR10).

Viaarxiv icon

Parallel Spiking Neurons with High Efficiency and Long-term Dependencies Learning Ability

Apr 25, 2023
Wei Fang, Zhaofei Yu, Zhaokun Zhou, Yanqi Chen, Zhengyu Ma, Timothée Masquelier, Yonghong Tian

Figure 1 for Parallel Spiking Neurons with High Efficiency and Long-term Dependencies Learning Ability
Figure 2 for Parallel Spiking Neurons with High Efficiency and Long-term Dependencies Learning Ability
Figure 3 for Parallel Spiking Neurons with High Efficiency and Long-term Dependencies Learning Ability
Figure 4 for Parallel Spiking Neurons with High Efficiency and Long-term Dependencies Learning Ability

Vanilla spiking neurons in Spiking Neural Networks (SNNs) use charge-fire-reset neuronal dynamics, which can only be simulated in serial and can hardly learn long-time dependencies. We find that when removing reset, the neuronal dynamics are reformulated in a non-iterative form and can be parallelized. By rewriting neuronal dynamics without resetting to a general formulation, we propose the Parallel Spiking Neuron (PSN), which uses dense connections between time-steps to maximize the utilization of temporal information. To avoid the use of future inputs for low-latency inference, we add masks on the weights and obtain the masked PSN. By sharing weights across time-steps, the sliding PSN is proposed with the ability to deal with sequences with variant lengths. We evaluate the PSN family on simulation speed and temporal/static data classification, and the results show the overwhelming advantage of the PSN family in efficiency and accuracy. To our best knowledge, this is the first research about parallelizing spiking neurons and can be a cornerstone for the spiking deep learning community. Our codes are available at \url{https://github.com/fangwei123456/Parallel-Spiking-Neuron}.

Viaarxiv icon

Interpretable Unified Language Checking

Apr 07, 2023
Tianhua Zhang, Hongyin Luo, Yung-Sung Chuang, Wei Fang, Luc Gaitskell, Thomas Hartvigsen, Xixin Wu, Danny Fox, Helen Meng, James Glass

Figure 1 for Interpretable Unified Language Checking
Figure 2 for Interpretable Unified Language Checking
Figure 3 for Interpretable Unified Language Checking
Figure 4 for Interpretable Unified Language Checking

Despite recent concerns about undesirable behaviors generated by large language models (LLMs), including non-factual, biased, and hateful language, we find LLMs are inherent multi-task language checkers based on their latent representations of natural and social knowledge. We present an interpretable, unified, language checking (UniLC) method for both human and machine-generated language that aims to check if language input is factual and fair. While fairness and fact-checking tasks have been handled separately with dedicated models, we find that LLMs can achieve high performance on a combination of fact-checking, stereotype detection, and hate speech detection tasks with a simple, few-shot, unified set of prompts. With the ``1/2-shot'' multi-task language checking method proposed in this work, the GPT3.5-turbo model outperforms fully supervised baselines on several language tasks. The simple approach and results suggest that based on strong latent knowledge representations, an LLM can be an adaptive and explainable tool for detecting misinformation, stereotypes, and hate speech.

* 10 + 5 pages 
Viaarxiv icon

Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks

Mar 08, 2023
Tong Bu, Wei Fang, Jianhao Ding, PengLin Dai, Zhaofei Yu, Tiejun Huang

Figure 1 for Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks
Figure 2 for Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks
Figure 3 for Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks
Figure 4 for Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks

Spiking Neural Networks (SNNs) have gained great attraction due to their distinctive properties of low power consumption and fast inference on neuromorphic hardware. As the most effective method to get deep SNNs, ANN-SNN conversion has achieved comparable performance as ANNs on large-scale datasets. Despite this, it requires long time-steps to match the firing rates of SNNs to the activation of ANNs. As a result, the converted SNN suffers severe performance degradation problems with short time-steps, which hamper the practical application of SNNs. In this paper, we theoretically analyze ANN-SNN conversion error and derive the estimated activation function of SNNs. Then we propose the quantization clip-floor-shift activation function to replace the ReLU activation function in source ANNs, which can better approximate the activation function of SNNs. We prove that the expected conversion error between SNNs and ANNs is zero, enabling us to achieve high-accuracy and ultra-low-latency SNNs. We evaluate our method on CIFAR-10/100 and ImageNet datasets, and show that it outperforms the state-of-the-art ANN-SNN and directly trained SNNs in both accuracy and time-steps. To the best of our knowledge, this is the first time to explore high-performance ANN-SNN conversion with ultra-low latency (4 time-steps). Code is available at https://github.com/putshua/SNN\_conversion\_QCFS

* International Conference on Learning Representations (2022)  
Viaarxiv icon

A Unified Framework for Soft Threshold Pruning

Feb 25, 2023
Yanqi Chen, Zhengyu Ma, Wei Fang, Xiawu Zheng, Zhaofei Yu, Yonghong Tian

Figure 1 for A Unified Framework for Soft Threshold Pruning
Figure 2 for A Unified Framework for Soft Threshold Pruning
Figure 3 for A Unified Framework for Soft Threshold Pruning
Figure 4 for A Unified Framework for Soft Threshold Pruning

Soft threshold pruning is among the cutting-edge pruning methods with state-of-the-art performance. However, previous methods either perform aimless searching on the threshold scheduler or simply set the threshold trainable, lacking theoretical explanation from a unified perspective. In this work, we reformulate soft threshold pruning as an implicit optimization problem solved using the Iterative Shrinkage-Thresholding Algorithm (ISTA), a classic method from the fields of sparse recovery and compressed sensing. Under this theoretical framework, all threshold tuning strategies proposed in previous studies of soft threshold pruning are concluded as different styles of tuning $L_1$-regularization term. We further derive an optimal threshold scheduler through an in-depth study of threshold scheduling based on our framework. This scheduler keeps $L_1$-regularization coefficient stable, implying a time-invariant objective function from the perspective of optimization. In principle, the derived pruning algorithm could sparsify any mathematical model trained via SGD. We conduct extensive experiments and verify its state-of-the-art performance on both Artificial Neural Networks (ResNet-50 and MobileNet-V1) and Spiking Neural Networks (SEW ResNet-18) on ImageNet datasets. On the basis of this framework, we derive a family of pruning methods, including sparsify-during-training, early pruning, and pruning at initialization. The code is available at https://github.com/Yanqi-Chen/LATS.

* To appear in the 11th International Conference on Learning Representations (ICLR 2023) 
Viaarxiv icon

Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans

Jan 28, 2023
Jieneng Chen, Yingda Xia, Jiawen Yao, Ke Yan, Jianpeng Zhang, Le Lu, Fakai Wang, Bo Zhou, Mingyan Qiu, Qihang Yu, Mingze Yuan, Wei Fang, Yuxing Tang, Minfeng Xu, Jian Zhou, Yuqian Zhao, Qifeng Wang, Xianghua Ye, Xiaoli Yin, Yu Shi, Xin Chen, Jingren Zhou, Alan Yuille, Zaiyi Liu, Ling Zhang

Figure 1 for Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans
Figure 2 for Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans
Figure 3 for Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans
Figure 4 for Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans

Human readers or radiologists routinely perform full-body multi-organ multi-disease detection and diagnosis in clinical practice, while most medical AI systems are built to focus on single organs with a narrow list of a few diseases. This might severely limit AI's clinical adoption. A certain number of AI models need to be assembled non-trivially to match the diagnostic process of a human reading a CT scan. In this paper, we construct a Unified Tumor Transformer (UniT) model to detect (tumor existence and location) and diagnose (tumor characteristics) eight major cancer-prevalent organs in CT scans. UniT is a query-based Mask Transformer model with the output of multi-organ and multi-tumor semantic segmentation. We decouple the object queries into organ queries, detection queries and diagnosis queries, and further establish hierarchical relationships among the three groups. This clinically-inspired architecture effectively assists inter- and intra-organ representation learning of tumors and facilitates the resolution of these complex, anatomically related multi-organ cancer image reading tasks. UniT is trained end-to-end using a curated large-scale CT images of 10,042 patients including eight major types of cancers and occurring non-cancer tumors (all are pathology-confirmed with 3D tumor masks annotated by radiologists). On the test set of 631 patients, UniT has demonstrated strong performance under a set of clinically relevant evaluation metrics, substantially outperforming both multi-organ segmentation methods and an assembly of eight single-organ expert models in tumor detection, segmentation, and diagnosis. Such a unified multi-cancer image reading model (UniT) can significantly reduce the number of false positives produced by combined multi-system models. This moves one step closer towards a universal high-performance cancer screening tool.

Viaarxiv icon

Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning

Nov 05, 2022
Zhe Liu, Yun Li, Lina Yao, Xiaojun Chang, Wei Fang, Xiaojun Wu, Yi Yang

Figure 1 for Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning
Figure 2 for Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning
Figure 3 for Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning
Figure 4 for Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning

The task of Compositional Zero-Shot Learning (CZSL) is to recognize images of novel state-object compositions that are absent during the training stage. Previous methods of learning compositional embedding have shown effectiveness in closed-world CZSL. However, in Open-World CZSL (OW-CZSL), their performance tends to degrade significantly due to the large cardinality of possible compositions. Some recent works separately predict simple primitives (i.e., states and objects) to reduce cardinality. However, they consider simple primitives as independent probability distributions, ignoring the heavy dependence between states, objects, and compositions. In this paper, we model the dependence of compositions via feasibility and contextuality. Feasibility-dependence refers to the unequal feasibility relations between simple primitives, e.g., \textit{hairy} is more feasible with \textit{cat} than with \textit{building} in the real world. Contextuality-dependence represents the contextual variance in images, e.g., \textit{cat} shows diverse appearances under the state of \textit{dry} and \textit{wet}. We design Semantic Attention (SA) and generative Knowledge Disentanglement (KD) to learn the dependence of feasibility and contextuality, respectively. SA captures semantics in compositions to alleviate impossible predictions, driven by the visual similarity between simple primitives. KD disentangles images into unbiased feature representations, easing contextual bias in predictions. Moreover, we complement the current compositional probability model with feasibility and contextuality in a compatible format. Finally, we conduct comprehensive experiments to analyze and validate the superior or competitive performance of our model, Semantic Attention and knowledge Disentanglement guided Simple Primitives (SAD-SP), on three widely-used benchmark OW-CZSL datasets.

Viaarxiv icon