Zhe Wang

Beijing University of Posts and Telecommunications

Evaluation and Control Model Design of Human Factors for Autonomous Driving Systems

Jul 03, 2023

Weishun Deng, Fan Yu, Zhe Wang, Dengbo He

Abstract: With the rapid development of driving automation technologies, users' psychological acceptance of driving automation has become one of the major obstacles to its adoption. The most basic function of a passenger car is to transport passengers or drivers to their destinations safely and comfortably. Thus, the design of driving automation should not only guarantee the safety of vehicle operation but also ensure the occupants' subjective comfort. Hence, this paper proposes a local path planning algorithm for obstacle avoidance that takes occupants' subjective feelings into account. First, turning and obstacle avoidance conditions are designed, and four machine learning classifiers are used to establish subjective and objective evaluation models that link objective vehicle dynamics parameters to occupants' subjective confidence. Then, two potential fields are established based on the artificial potential field method, reflecting drivers' psychological feelings about obstacles and road boundaries. Accordingly, a path planning algorithm and a path tracking algorithm are each designed based on model predictive control, with the psychological safety boundary and the optimal classifier used as part of the cost functions. Finally, co-simulations in MATLAB/Simulink and CarSim are carried out. The results confirm the effectiveness of the proposed control algorithm, which avoids obstacles satisfactorily and effectively improves the psychological feeling of occupants.

* 102nd Transportation Research Board Annual Meeting
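
The artificial potential field idea mentioned in the abstract can be illustrated with a minimal sketch. It uses the classic textbook repulsive potential, not the psychological fields built in the paper; the function names, the `gain` parameter, and the influence radius are all assumptions made for illustration.

```python
import math


def repulsive_potential(distance, influence_radius, gain=1.0):
    """Classic repulsive potential (an assumed textbook form).

    The value grows as the obstacle gets closer and is zero once the
    obstacle lies outside the influence radius.
    """
    if distance <= 0:
        return math.inf  # touching the obstacle: treat as infinitely costly
    if distance >= influence_radius:
        return 0.0
    return 0.5 * gain * (1.0 / distance - 1.0 / influence_radius) ** 2


def path_cost(path, obstacle, influence_radius):
    """Sum the repulsive potential over all waypoints of a candidate path."""
    return sum(
        repulsive_potential(math.dist(point, obstacle), influence_radius)
        for point in path
    )
```

A planner could add a term like `path_cost` to its cost function so that candidate paths passing close to an obstacle are penalized more heavily than paths that keep their distance.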

An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs

Jun 28, 2023

Haihao Shen, Hengyu Meng, Bo Dong, Zhe Wang, Ofir Zafrir, Yi Ding, Yu Luo, Hanwen Chang, Qun Gao, Ziheng Wang(+2 more)

Abstract: In recent years, Transformer-based language models have become the standard approach for natural language processing tasks. However, stringent throughput and latency requirements in industrial applications are limiting their adoption. To close the gap, model compression techniques such as structured pruning are being used to improve inference efficiency. However, most existing neural network inference runtimes lack adequate support for structured sparsity. In this paper, we propose an efficient sparse deep learning inference software stack for Transformer-based language models in which the weights are pruned with a constant block size. Our sparse software accelerator leverages Intel Deep Learning Boost to maximize the performance of sparse matrix-dense matrix multiplication (commonly abbreviated as SpMM) on CPUs. Our SpMM kernel outperforms the existing sparse libraries (oneMKL, TVM, and LIBXSMM) by an order of magnitude on a wide range of GEMM shapes under five representative sparsity ratios (70%, 75%, 80%, 85%, 90%). Moreover, our SpMM kernel shows up to 5x speedup over the dense GEMM kernel of oneDNN, a well-optimized dense library widely used in industry. We apply our sparse accelerator to widely used Transformer-based language models including Bert-Mini, DistilBERT, Bert-Base, and BERT-Large. Our sparse inference software shows up to 1.5x speedup over Neural Magic's Deepsparse under the same configurations on Xeon on Amazon Web Services under proxy production latency constraints. We also compare our solution with two framework-based inference solutions, ONNX Runtime and PyTorch, and demonstrate up to 37x speedup over ONNX Runtime and 345x over PyTorch on Xeon under the latency constraints. All the source code is publicly available on GitHub: https://github.com/intel/intel-extension-for-transformers.
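
To make the idea of SpMM with constant-block-size pruning concrete, here is a minimal pure-Python sketch. It is not the paper's kernel and uses no Intel Deep Learning Boost instructions; the storage layout (a dictionary of nonzero blocks keyed by their top-left corner) and all function names are assumptions chosen for readability.

```python
def dense_matmul(a, b):
    """Reference dense matrix product, used only to check the sparse result."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def to_block_sparse(weight, block_rows, block_cols):
    """Keep only the blocks that contain at least one nonzero value."""
    blocks = {}
    for r0 in range(0, len(weight), block_rows):
        for c0 in range(0, len(weight[0]), block_cols):
            block = [row[c0:c0 + block_cols] for row in weight[r0:r0 + block_rows]]
            if any(value != 0 for row in block for value in row):
                blocks[(r0, c0)] = block
    return blocks


def block_spmm(blocks, n_rows, dense):
    """Multiply a block-sparse weight matrix by a dense matrix.

    Pruned (all-zero) blocks are never stored, so they cost no work here.
    """
    n_cols = len(dense[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for (r0, c0), block in blocks.items():
        for i, block_row in enumerate(block):
            for k, w in enumerate(block_row):
                for j in range(n_cols):
                    out[r0 + i][j] += w * dense[c0 + k][j]
    return out
```

The work done by `block_spmm` scales with the number of stored blocks rather than the full matrix size, which is the property a structured-sparsity kernel exploits.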

Federated Learning-based Vehicle Trajectory Prediction against Cyberattacks

Jun 14, 2023

Zhe Wang, Tingkai Yan

Abstract: With the development of the Internet of Vehicles (IoV), vehicle wireless communication poses serious cybersecurity challenges. Faulty information, such as fake vehicle positions and speeds sent by surrounding vehicles, could cause vehicle collisions, traffic jams, and even casualties. Additionally, leakage of private vehicle data, such as vehicle trajectories and user account information, may damage user property and security. Therefore, a cyberattack-defense scheme is necessary for IoV systems saturated with faulty data. This paper proposes a Federated Learning-based Vehicle Trajectory Prediction Algorithm against Cyberattacks (FL-TP) to address the above problems. FL-TP is intensively trained and tested using the publicly available Vehicular Reference Misbehavior (VeReMi) dataset with five types of cyberattacks: constant, constant offset, random, random offset, and eventual stop. The results show that the proposed FL-TP algorithm can improve cyberattack detection and trajectory prediction by up to 6.99% and 54.86%, respectively, under the maximum cyberattack permeability scenarios compared with benchmark methods.
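
The abstract does not state how FL-TP aggregates client models. As an illustration of the federated learning step in general, the sketch below uses plain federated averaging (FedAvg), which is an assumption and may differ from the paper's rule; `federated_average` and its arguments are hypothetical names.

```python
def federated_average(client_params, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    This is plain FedAvg, assumed here for illustration only. Raw data
    never leaves a client; only parameter vectors are shared.
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[p] * size for params, size in zip(client_params, client_sizes)) / total
        for p in range(n_params)
    ]
```

For example, `federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])` gives the second client three times the weight of the first, because it holds three times as many samples.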

Muti-Scale And Token Mergence: Make Your ViT More Efficient

Jun 08, 2023

Zhe Bian, Zhe Wang, Wenqiang Han, Kangping Wang

Abstract: Since its inception, Vision Transformer (ViT) has emerged as a prevalent model in the computer vision domain. Nonetheless, the multi-head self-attention (MHSA) mechanism in ViT is computationally expensive because it computes relationships among all tokens. Although some techniques reduce computational overhead by discarding tokens, this also loses the information those tokens may carry. To tackle these issues, we propose a novel token pruning method that retains information from non-crucial tokens by merging them with more crucial tokens, thereby mitigating the impact of pruning on model performance. Crucial and non-crucial tokens are identified by their importance scores and merged based on similarity scores. Furthermore, multi-scale features are exploited to represent images, and they are fused prior to token pruning to produce richer feature representations. Importantly, our method can be seamlessly integrated with various ViTs, enhancing their adaptability. Experimental evidence substantiates the efficacy of our approach in reducing the influence of token pruning on model performance. For instance, on the ImageNet dataset, it achieves a 33% reduction in computational costs while incurring only a 0.1% decrease in accuracy on DeiT-S.
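
The merge-instead-of-discard step described above can be sketched as follows. The abstract only says that tokens are split by importance score and merged by similarity score; the use of cosine similarity, the plain averaging rule, and the names `cosine` and `merge_tokens` are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two token vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_tokens(tokens, scores, keep):
    """Keep the `keep` highest-scoring tokens and fold the rest into them.

    Each non-crucial token is merged into its most similar crucial token,
    and every kept token becomes the mean of itself and its merged tokens.
    """
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(order[:keep])  # preserve the original token order
    sums = {i: list(tokens[i]) for i in kept}
    counts = {i: 1 for i in kept}
    for j in order[keep:]:
        target = max(kept, key=lambda i: cosine(tokens[i], tokens[j]))
        sums[target] = [s + t for s, t in zip(sums[target], tokens[j])]
        counts[target] += 1
    return [[s / counts[i] for s in sums[i]] for i in kept]
```

The output has only `keep` tokens, so later attention layers do less work, yet each dropped token still contributes to one of the survivors.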

Progression Cognition Reinforcement Learning with Prioritized Experience for Multi-Vehicle Pursuit

Jun 08, 2023

Xinhang Li, Yiying Yang, Zheng Yuan, Zhe Wang, Qinwen Wang, Chen Xu, Lei Li, Jianhua He, Lin Zhang

Abstract: Multi-vehicle pursuit (MVP), such as autonomous police vehicles pursuing suspects, is important but very challenging due to its mission-critical and safety-critical nature. While multi-agent reinforcement learning (MARL) algorithms have been proposed for the MVP problem on structured grid-pattern roads, the existing algorithms use randomly selected training samples in centralized learning, which leads to homogeneous agents with low collaboration performance. For the more challenging problem of pursuing multiple evading vehicles, these algorithms typically assign a fixed target evading vehicle to the pursuing vehicles without considering the dynamic traffic situation, which significantly reduces the pursuit success rate. To address the above problems, this paper proposes Progression Cognition Reinforcement Learning with Prioritized Experience for MVP (PEPCRL-MVP) in urban multi-intersection dynamic traffic scenes. PEPCRL-MVP uses a prioritization network to assess the transitions in the global experience replay buffer according to the parameters of each MARL agent. With the personalized and prioritized experience set selected via the prioritization network, diversity is introduced into the learning process of MARL, which can improve collaboration and task-related performance. Furthermore, PEPCRL-MVP employs an attention module to extract critical features from complex urban traffic environments. These features are used to develop a progression cognition method that adaptively groups pursuing vehicles. Each group efficiently targets one evading vehicle in dynamic driving environments. Extensive experiments conducted with a simulator over unstructured roads of an urban area show that PEPCRL-MVP is superior to other state-of-the-art methods. Specifically, PEPCRL-MVP improves pursuing efficiency by 3.95% over TD3-DMAP, and its success rate is 34.78% higher than that of MADDPG. Codes are open sourced.

COPR: Consistency-Oriented Pre-Ranking for Online Advertising

Jun 06, 2023

Zhishan Zhao, Jingyue Gao, Yu Zhang, Shuguang Han, Siyuan Lou, Xiang-Rong Sheng, Zhe Wang, Han Zhu, Yuning Jiang, Jian Xu(+1 more)

Abstract: Cascading architecture has been widely adopted in large-scale advertising systems to balance efficiency and effectiveness. In this architecture, the pre-ranking model is expected to be a lightweight approximation of the ranking model, which handles more candidates with strict latency requirements. Due to the gap in model capacity, the pre-ranking and ranking models usually generate inconsistent ranked results, thus hurting the overall system effectiveness. The paradigm of score alignment has been proposed to regularize their raw scores to be consistent. However, it suffers from inevitable alignment errors and error amplification by bids when applied in online advertising. To this end, we introduce a consistency-oriented pre-ranking framework for online advertising, which employs a chunk-based sampling module and a plug-and-play rank alignment module to explicitly optimize consistency of ECPM-ranked results. A $\Delta NDCG$-based weighting mechanism is adopted to better distinguish the importance of inter-chunk samples in optimization. Both online and offline experiments have validated the superiority of our framework. When deployed in the Taobao display advertising system, it achieves an improvement of up to +12.3% CTR and +5.6% RPM.
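
A $\Delta NDCG$ weight measures how much the ranking quality would change if two items swapped positions. The sketch below uses the standard NDCG definition with exponential gain and logarithmic discount; how the paper applies it to chunks is not specified in the abstract, so the function names and the item-level swap are assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance labels."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances)
    )


def delta_ndcg(relevances, i, j):
    """Absolute change in NDCG if the items at positions i and j are swapped."""
    ideal = dcg(sorted(relevances, reverse=True))
    if ideal == 0:
        return 0.0
    swapped = list(relevances)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(dcg(swapped) - dcg(relevances)) / ideal
```

Used as a pair weight, this puts more emphasis on mistakes near the top of the list, where the relevance gap and the discount gap between the two positions are both large, than on swaps near the bottom.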

Graphy Analysis Using a GPU-based Parallel Algorithm: Quantum Clustering

May 24, 2023

Zhe Wang, ZhiJie He, Ding Liu

Abstract: The article introduces a new method for applying Quantum Clustering to graph structures. Quantum Clustering (QC) is a novel density-based unsupervised learning method that determines cluster centers by constructing a potential function. In this method, we use the Graph Gradient Descent algorithm to find the centers of clusters. GPU parallelization is used to compute potential values. We also conducted experiments on five widely used datasets and evaluated the results using four indicators. The results show superior performance of the method. Finally, we discuss the influence of $\sigma$ on the experimental results.
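
For readers unfamiliar with QC, the potential function can be sketched in its standard Euclidean form. With a Gaussian wave function $\psi(x) = \sum_i \exp(-\|x - x_i\|^2 / 2\sigma^2)$, the potential is, up to an additive constant, $V(x) = \frac{1}{2\sigma^2 \psi(x)} \sum_i \|x - x_i\|^2 \exp(-\|x - x_i\|^2 / 2\sigma^2)$, and cluster centers sit at its minima. The graph-based variant and GPU parallelization from the paper are not reproduced here, and `qc_potential` is a hypothetical name.

```python
import math


def qc_potential(x, points, sigma):
    """Quantum Clustering potential at x, up to an additive constant.

    Standard Euclidean formulation, assumed for illustration. Points far
    from every sample can underflow psi to zero, which this sketch does
    not guard against.
    """
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, p)) for p in points]
    weights = [math.exp(-d2 / (2 * sigma ** 2)) for d2 in sq_dists]
    psi = sum(weights)
    weighted = sum(d2 * w for d2, w in zip(sq_dists, weights))
    return weighted / (2 * sigma ** 2 * psi)
```

Each evaluation is independent of the others, which is why computing potential values for many points at once maps naturally onto parallel hardware. The width $\sigma$ controls how far each data point's influence reaches, so it determines how many distinct minima the potential has.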

A Lightweight Domain Adversarial Neural Network Based on Knowledge Distillation for EEG-based Cross-subject Emotion Recognition

May 12, 2023

Zhe Wang, Yongxiong Wang, Jiapeng Zhang, Yiheng Tang, Zhiqun Pan

Abstract: Individual differences in electroencephalogram (EEG) signals can cause a domain shift that significantly degrades the performance of cross-subject strategies. Domain adversarial neural networks (DANN), in which the classification loss and domain loss jointly update the parameters of the feature extractor, are adopted to deal with the domain shift. However, the limited quantity of EEG data and strong individual differences are challenges for a DANN with a cumbersome feature extractor. In this work, we propose a knowledge distillation (KD) based lightweight DANN to enhance cross-subject EEG-based emotion recognition. Specifically, a teacher model with strong context learning ability is used to learn the complex temporal dynamics and spatial correlations of EEG, and a robust lightweight student model is guided by the teacher model to learn more difficult domain-invariant features. In the feature-based KD framework, a transformer-based hierarchical temporal-spatial learning model serves as the teacher model. The student model, which is composed of Bi-LSTM units, is a lightweight version of the teacher model. Hence, the student model can be supervised to mimic the robust feature representations of the teacher model by leveraging complementary latent temporal and spatial features. In the DANN-based cross-subject emotion recognition, we combine the obtained student model and a lightweight temporal-spatial feature interaction module as the feature extractor. The aggregated features are then fed to the emotion classifier and domain classifier for domain-invariant feature learning. To verify the effectiveness of the proposed method, we conduct subject-independent experiments on the public DEAP dataset with arousal and valence classification. The outstanding performance and t-SNE visualization of latent features verify the advantage and effectiveness of the proposed method.

Blockchained Federated Learning for Internet of Things: A Comprehensive Survey

May 08, 2023

Yanna Jiang, Baihe Ma, Xu Wang, Ping Yu, Guangsheng Yu, Zhe Wang, Wei Ni, Ren Ping Liu

Abstract: The demand for intelligent industries and smart services based on big data is rising rapidly with the increasing digitization and intelligence of the modern world. This survey comprehensively reviews Blockchained Federated Learning (BlockFL), which combines the benefits of both Blockchain and Federated Learning to provide a secure and efficient solution for this demand. We compare the existing BlockFL models in four Internet-of-Things (IoT) application scenarios: Personal IoT (PIoT), Industrial IoT (IIoT), Internet of Vehicles (IoV), and Internet of Health Things (IoHT), with a focus on security and privacy, trust and reliability, efficiency, and data heterogeneity. Our analysis shows that the features of decentralization and transparency make BlockFL a secure and effective solution for distributed model training, while the overhead and compatibility still need further study. It also reveals that each domain presents unique challenges, e.g., the requirement of accommodating dynamic environments in IoV and the high demands of identity and permission management in IoHT, in addition to some common challenges, such as privacy, resource constraints, and data heterogeneity. Furthermore, we examine the existing technologies that can benefit BlockFL, thereby helping researchers and practitioners make informed decisions about the selection and development of BlockFL for various IoT application scenarios.

PGrad: Learning Principal Gradients For Domain Generalization

May 02, 2023

Zhe Wang, Jake Grigsby, Yanjun Qi

Abstract: Machine learning models fail to perform when facing out-of-distribution (OOD) domains, a challenging task known as domain generalization (DG). In this work, we develop a novel DG training strategy, which we call PGrad, to learn a robust gradient direction, improving models' generalization ability on unseen domains. The proposed gradient aggregates the principal directions of a sampled roll-out optimization trajectory that measures the training dynamics across all training domains. PGrad's gradient design forces the DG training to ignore domain-dependent noise signals and updates all training domains with a robust direction covering the main components of parameter dynamics. We further improve PGrad via bijection-based computational refinement and directional plus length-based calibrations. Our theoretical proof connects PGrad to the spectral analysis of the Hessian in training neural networks. Experiments on the DomainBed and WILDS benchmarks demonstrate that our approach effectively enables robust DG optimization and leads to smoothly decreasing loss curves. Empirically, PGrad achieves competitive results across seven datasets, demonstrating its efficacy across both synthetic and real-world distributional shifts. Code is available at https://github.com/QData/PGrad.
