Ding Li

Planning-inspired Hierarchical Trajectory Prediction for Autonomous Driving

Apr 22, 2023

Ding Li, Qichao Zhang, Zhongpu Xia, Kuan Zhang, Menglong Yi, Wenda Jin, Dongbin Zhao

Abstract: Recently, anchor-based trajectory prediction methods have shown promising performance; they directly select a final set of anchors as future intents in the spatio-temporal coupled space. However, such methods typically neglect a deeper semantic interpretation of path intents and suffer from inferior performance under an imperfect High-Definition (HD) map. To address this challenge, we propose a novel Planning-inspired Hierarchical (PiH) trajectory prediction framework that selects path and speed intents through a hierarchical lateral and longitudinal decomposition. Specifically, a hybrid lateral predictor is presented to select a set of fixed-distance lateral paths from map-based road-following and cluster-based free-move path candidates. Then, the subsequent longitudinal predictor selects plausible goals sampled from a set of lateral paths as speed intents. Finally, a trajectory decoder is given to generate future trajectories conditioned on a categorical distribution over lateral-longitudinal intents. Experiments demonstrate that PiH achieves competitive and more balanced results against state-of-the-art methods on the Argoverse motion forecasting benchmark and has the strongest robustness under an imperfect HD map.

* 9 pages, 4 figures

Active Learning with Effective Scoring Functions for Semi-Supervised Temporal Action Localization

Aug 31, 2022

Ding Li, Xuebing Yang, Yongqiang Tang, Chenyang Zhang, Wensheng Zhang

Abstract: Temporal Action Localization (TAL) aims to predict both the action category and the temporal boundary (i.e., start and end time) of action instances in untrimmed videos. Fully-supervised solutions are adopted in most existing works and have proven to be effective. One of the practical bottlenecks in these solutions is the large amount of labeled training data required. To reduce expensive human labeling cost, this paper focuses on a rarely investigated yet practical task named semi-supervised TAL and proposes an effective active learning method, named AL-STAL. We leverage four steps, named Train, Query, Annotate, and Append, for actively selecting video samples with high informativeness and training the localization model. AL-STAL is equipped with two scoring functions that consider the uncertainty of the localization model, thus facilitating the ranking and selection of video samples. One takes the entropy of the predicted label distribution as a measure of uncertainty, named Temporal Proposal Entropy (TPE). The other introduces a new metric based on mutual information between adjacent action proposals and evaluates the informativeness of video samples, named Temporal Context Inconsistency (TCI). To validate the effectiveness of the proposed method, we conduct extensive experiments on two benchmark datasets, THUMOS'14 and ActivityNet 1.3. Experimental results show that AL-STAL outperforms the existing competitors and achieves satisfying performance compared with fully-supervised learning.

* 12 pages, 7 figures, submitted to TCSVT. arXiv admin note: text overlap with arXiv:2103.13137 by other authors
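
As a rough illustration of the entropy-based scoring idea behind TPE, the sketch below ranks unlabeled videos by the mean Shannon entropy of their proposals' predicted class distributions and picks the most uncertain ones for annotation. The averaging step, the function names, and the toy probabilities are assumptions made for illustration, not the authors' implementation. TCI is not sketched because the abstract does not define it precisely.

```python
import math

def proposal_entropy(probs):
    """Shannon entropy (in nats) of one proposal's predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def video_score(proposals):
    """Mean entropy over a video's proposals (the averaging is an assumption)."""
    return sum(proposal_entropy(p) for p in proposals) / len(proposals)

def select_for_annotation(unlabeled, budget):
    """Return the ids of the `budget` most uncertain videos, highest score first."""
    ranked = sorted(unlabeled, key=lambda vid: video_score(unlabeled[vid]), reverse=True)
    return ranked[:budget]

# Toy data: each video maps to a list of per-proposal class distributions.
unlabeled = {
    "video_a": [[0.98, 0.01, 0.01], [0.90, 0.05, 0.05]],  # confident predictions
    "video_b": [[0.34, 0.33, 0.33], [0.50, 0.25, 0.25]],  # near-uniform predictions
    "video_c": [[0.70, 0.20, 0.10], [0.60, 0.30, 0.10]],
}
picked = select_for_annotation(unlabeled, budget=2)
```

In a Train, Query, Annotate, Append loop, `picked` would be the videos sent to annotators before the labeled set is extended and the model retrained.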

Attentive pooling for Group Activity Recognition

Aug 31, 2022

Ding Li, Yuan Xie, Wensheng Zhang, Yongqiang Tang, Zhizhong Zhang

Abstract: In group activity recognition, a hierarchical framework is widely adopted to represent the relationships between individuals and their corresponding group, and has achieved promising performance. However, existing methods simply employ max/average pooling in this framework, which ignores the distinct contributions of different individuals to the group activity recognition. In this paper, we propose a new contextual pooling scheme, named attentive pooling, which enables weighted information transition from individual actions to group activity. By utilizing the attention mechanism, attentive pooling is intrinsically interpretable and able to embed member context into the existing hierarchical model. To verify the effectiveness of the proposed scheme, two specific attentive pooling methods, i.e., global attentive pooling (GAP) and hierarchical attentive pooling (HAP), are designed. GAP rewards the individuals that are significant to group activity, while HAP further considers the hierarchical division by introducing subgroup structure. The experimental results on the benchmark dataset demonstrate that our proposal is significantly superior to the baseline and is comparable to the state-of-the-art methods.

* 7 pages, 7 figures
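
To make the contrast with max/average pooling concrete, here is a minimal sketch of attention-weighted pooling over per-person features. The dot-product scoring against a `query` vector, the function names, and the toy features are assumptions; the abstract does not specify how GAP or HAP compute their weights.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attentive_pool(person_feats, query):
    """Weight each person's feature vector by softmax(dot(feature, query)) and sum."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in person_feats]
    weights = softmax(scores)
    dim = len(person_feats[0])
    pooled = [sum(w * feat[d] for w, feat in zip(weights, person_feats)) for d in range(dim)]
    return pooled, weights

person_feats = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]  # toy per-person features
query = [1.0, 0.0]                                   # hypothetical learned query vector
pooled, weights = attentive_pool(person_feats, query)
```

The returned `weights` can be read directly as each individual's contribution to the group representation. With an all-zero query the weights become uniform and the result reduces to average pooling.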

Diagnostic Communication and Visual System based on Vehicle UDS Protocol

Jun 25, 2022

Hong Zhang, Ding Li

Abstract: Unified Diagnostic Services (UDS) is a diagnostic communication protocol used in electronic control units (ECUs) within automotive electronics, which is specified in ISO 14229-1. It is derived from ISO 14230-3 (KWP2000) and the now obsolete ISO 15765-3 (Diagnostic Communication over Controller Area Network (DoCAN)). 'Unified' in this context means that it is an international and not a company-specific standard. By now this communication protocol is used in all new ECUs made by Tier 1 suppliers of Original Equipment Manufacturers (OEMs), and is incorporated into other standards, such as AUTOSAR. The ECUs in modern vehicles control nearly all functions, including electronic fuel injection (EFI), engine control, the transmission, anti-lock braking system, door locks, braking, window operation, and more.

Multi-scale 2D Representation Learning for weakly-supervised moment retrieval

Nov 04, 2021

Ding Li, Rui Wu, Yongqiang Tang, Zhizhong Zhang, Wensheng Zhang

Abstract: Video moment retrieval aims to search the moment most relevant to a given language query. However, most existing methods in this community require temporal boundary annotations, which are expensive and time-consuming to label. Hence, weakly supervised methods that use only coarse video-level labels have been put forward recently. Despite their effectiveness, these methods usually process moment candidates independently, ignoring a critical issue: the natural temporal dependencies between candidates at different temporal scales. To cope with this issue, we propose a Multi-scale 2D Representation Learning method for weakly supervised video moment retrieval. Specifically, we first construct a two-dimensional map for each temporal scale to capture the temporal dependencies between candidates. The two dimensions of this map indicate the start and end time points of these candidates. Then, we select top-K candidates from each scale-varied map with a learnable convolutional neural network. With a newly designed Moments Evaluation Module, we obtain the alignment scores of the selected candidates. Finally, the similarity between captions and the language query serves as supervision for further training the candidate selector. Experiments on two benchmark datasets, Charades-STA and ActivityNet Captions, demonstrate that our approach achieves superior performance to state-of-the-art results.

* 8 pages, 4 figures. Accepted for publication in 2020 25th International Conference on Pattern Recognition (ICPR)
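
The two-dimensional candidate map described in the abstract can be sketched as follows: one axis indexes the start clip, the other the end clip, and one map is built per temporal scale. The clip lengths, the function names, and the choice to store time spans (rather than learned features) in each cell are assumptions for illustration only.

```python
def build_2d_map(num_clips, clip_len):
    """Cell [i][j] holds (start_sec, end_sec) of the candidate spanning clips i..j.

    Cells with j < i are None because a moment cannot end before it starts.
    """
    return [
        [(i * clip_len, (j + 1) * clip_len) if j >= i else None for j in range(num_clips)]
        for i in range(num_clips)
    ]

def multi_scale_maps(video_len, clip_lens):
    """Build one 2D candidate map per temporal scale (clip length in seconds)."""
    return {clip_len: build_2d_map(video_len // clip_len, clip_len) for clip_len in clip_lens}

def count_candidates(grid):
    """Number of valid (start <= end) cells in a map."""
    return sum(cell is not None for row in grid for cell in row)

# A 32-second video viewed at two hypothetical scales: 4 s clips and 8 s clips.
maps = multi_scale_maps(video_len=32, clip_lens=[4, 8])
```

Neighboring cells in a map correspond to candidates with adjacent start or end points, which is what lets a convolutional selector see temporal context between candidates.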

DistFL: Distribution-aware Federated Learning for Mobile Scenarios

Oct 22, 2021

Bingyan Liu, Yifeng Cai, Ziqi Zhang, Yuanchun Li, Leye Wang, Ding Li, Yao Guo, Xiangqun Chen

Abstract: Federated learning (FL) has emerged as an effective solution to decentralized and privacy-preserving machine learning for mobile clients. While traditional FL has demonstrated its superiority, it ignores the non-iid (independently identically distributed) situation, which widely exists in mobile scenarios. Failing to handle non-iid situations could cause problems such as performance degradation and possible attacks. Previous studies focus on the "symptoms" directly, as they try to improve the accuracy or detect possible attacks by adding extra steps to conventional FL models. However, previous techniques overlook the root cause of the "symptoms": blindly aggregating models with non-iid distributions. In this paper, we try to fundamentally address the issue by decomposing the overall non-iid situation into several iid clusters and conducting aggregation in each cluster. Specifically, we propose DistFL, a novel framework to achieve automated and accurate Distribution-aware Federated Learning in a cost-efficient way. DistFL achieves clustering via extracting and comparing the distribution knowledge from the uploaded models. With this framework, we are able to generate multiple personalized models with distinctive distributions and assign them to the corresponding clients. Extensive experiments on mobile scenarios with popular model architectures have demonstrated the effectiveness of DistFL.

* This paper has been accepted by IMWUT2021(Ubicomp)
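
The cluster-then-aggregate idea can be sketched in a few lines. This sketch assumes each client's "distribution knowledge" is already available as a label-distribution vector, uses a greedy L1-distance clustering with a hypothetical `threshold`, and averages flat weight vectors within each cluster. How DistFL actually extracts and compares distribution knowledge from uploaded models is not described in the abstract, so all of these choices are illustrative assumptions.

```python
def l1_distance(p, q):
    """Sum of absolute differences between two distribution vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))

def cluster_clients(distributions, threshold):
    """Greedy clustering: join the first cluster whose first member is within threshold."""
    clusters = []  # each cluster is a list of client ids
    for cid, dist in distributions.items():
        for members in clusters:
            if l1_distance(dist, distributions[members[0]]) <= threshold:
                members.append(cid)
                break
        else:
            clusters.append([cid])
    return clusters

def aggregate(weights, members):
    """Plain average of the members' weight vectors (one model per cluster)."""
    dim = len(weights[members[0]])
    return [sum(weights[c][d] for c in members) / len(members) for d in range(dim)]

# Toy clients: two see mostly class 0, two see mostly class 1.
distributions = {"a": [0.9, 0.1], "b": [0.8, 0.2], "c": [0.1, 0.9], "d": [0.2, 0.8]}
weights = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [10.0, 20.0], "d": [30.0, 40.0]}
clusters = cluster_clients(distributions, threshold=0.5)
models = [aggregate(weights, members) for members in clusters]
```

Each entry of `models` would then be sent back only to the clients in the matching cluster, instead of one global average being sent to everyone.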

Learning Heatmap-Style Jigsaw Puzzles Provides Good Pretraining for 2D Human Pose Estimation

Dec 13, 2020

Kun Zhang, Rui Wu, Ping Yao, Kai Deng, Ding Li, Renbiao Liu, Chuanguang Yang, Ge Chen, Min Du, Tianyao Zheng

Abstract: The target of 2D human pose estimation is to locate the keypoints of body parts from input 2D images. State-of-the-art methods for pose estimation usually construct pixel-wise heatmaps from keypoints as labels for learning convolutional neural networks, whose backbones are usually initialized randomly or from classification models trained on ImageNet. We note that the 2D pose estimation task is highly dependent on the contextual relationship between image patches; thus, we introduce a self-supervised method for pretraining 2D pose estimation networks. Specifically, we propose the Heatmap-Style Jigsaw Puzzles (HSJP) problem as our pretext task, whose target is to learn the location of each patch from an image composed of shuffled patches. During our pretraining process, we only use images of person instances in MS-COCO, rather than introducing the extra and much larger ImageNet dataset. A heatmap-style label for patch location is designed, and our learning process is non-contrastive. The weights learned by the HSJP pretext task are utilised as backbones of 2D human pose estimators, which are then finetuned on the MS-COCO human keypoints dataset. With two popular and strong 2D human pose estimators, HRNet and SimpleBaseline, we evaluate the mAP score on both the MS-COCO validation and test-dev datasets. Our experiments show that downstream pose estimators with our self-supervised pretraining obtain much better performance than those trained from scratch, and are comparable to those using ImageNet classification models as their initial backbones.
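
One way to picture a heatmap-style label for patch location is sketched below: patches are shuffled, and each shuffled patch gets a small heatmap over the patch grid that peaks at the patch's original position. The Gaussian shape, the `sigma` value, the 3x3 grid, and the function names are all assumptions; the abstract only states that a heatmap-style label is used.

```python
import math
import random

def shuffle_patches(grid_size, seed=0):
    """Return a permutation: perm[k] is the original index of the patch placed at slot k."""
    perm = list(range(grid_size * grid_size))
    random.Random(seed).shuffle(perm)
    return perm

def location_heatmap(orig_index, grid_size, sigma=0.5):
    """Gaussian heatmap over the grid, peaking at the patch's original (row, col)."""
    r0, c0 = divmod(orig_index, grid_size)
    return [
        [math.exp(-((r - r0) ** 2 + (c - c0) ** 2) / (2 * sigma ** 2)) for c in range(grid_size)]
        for r in range(grid_size)
    ]

grid_size = 3
perm = shuffle_patches(grid_size)
# One target heatmap per shuffled slot; a network would regress these from the shuffled image.
labels = [location_heatmap(orig, grid_size) for orig in perm]
```

Framing the target as a heatmap rather than a class index keeps the pretext output in the same form as the keypoint heatmaps used in the downstream pose task.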

SIGL: Securing Software Installations Through Deep Graph Learning

Aug 26, 2020

Xueyuan Han, Xiao Yu, Thomas Pasquier, Ding Li, Junghwan Rhee, James Mickens, Margo Seltzer, Haifeng Chen

Abstract: Many users implicitly assume that software can only be exploited after it is installed. However, recent supply-chain attacks demonstrate that application integrity must be ensured during installation itself. We introduce SIGL, a new tool for detecting malicious behavior during software installation. SIGL collects traces of system call activity, building a data provenance graph that it analyzes using a novel autoencoder architecture with a graph long short-term memory network (graph LSTM) for the encoder and a standard multilayer perceptron for the decoder. SIGL flags suspicious installations as well as the specific installation-time processes that are likely to be malicious. Using a test corpus of 625 malicious installers containing real-world malware, we demonstrate that SIGL has a detection accuracy of 96%, outperforming similar systems from industry and academia by up to 87% in precision and recall and 45% in accuracy. We also demonstrate that SIGL can pinpoint the processes most likely to have triggered malicious behavior, works on different audit platforms and operating systems, and is robust to training data contamination and adversarial attack. It can be used with application-specific models, even in the presence of new software versions, as well as application-agnostic meta-models that encompass a wide range of applications and installers.

* 18 pages, to appear in the 30th USENIX Security Symposium (USENIX Security '21)

Structural Temporal Graph Neural Networks for Anomaly Detection in Dynamic Graphs

May 25, 2020

Lei Cai, Zhengzhang Chen, Chen Luo, Jiaping Gui, Jingchao Ni, Ding Li, Haifeng Chen

Abstract: Detecting anomalies in dynamic graphs is a vital task, with numerous practical applications in areas such as security, finance, and social media. Previous network-embedding-based methods have mostly focused on learning good node representations, while largely ignoring the subgraph structural changes related to the target nodes in dynamic graphs. In this paper, we propose StrGNN, an end-to-end structural temporal Graph Neural Network model for detecting anomalous edges in dynamic graphs. In particular, we first extract the $h$-hop enclosing subgraph centered on the target edge and propose a node labeling function to identify the role of each node in the subgraph. Then, we leverage a graph convolution operation and a SortPooling layer to extract a fixed-size feature from each snapshot/timestamp. Based on the extracted features, we utilize gated recurrent units (GRUs) to capture the temporal information for anomaly detection. Extensive experiments on six benchmark datasets and a real enterprise security system demonstrate the effectiveness of StrGNN.
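
The first step, extracting the $h$-hop enclosing subgraph around a target edge, can be sketched with two breadth-first searches. The sketch assumes the common definition in which the subgraph is induced by all nodes within $h$ hops of either endpoint; the adjacency-dict representation and function names are illustrative, and the node labeling function is omitted because the abstract does not specify it.

```python
from collections import deque

def k_hop_nodes(adj, source, h):
    """Nodes reachable from `source` within h hops (BFS), including `source`."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if depth[node] == h:
            continue  # do not expand beyond h hops
        for nbr in adj.get(node, ()):
            if nbr not in depth:
                depth[nbr] = depth[node] + 1
                queue.append(nbr)
    return set(depth)

def enclosing_subgraph(adj, edge, h):
    """Node set and induced undirected edges of the h-hop enclosing subgraph around `edge`."""
    u, v = edge
    nodes = k_hop_nodes(adj, u, h) | k_hop_nodes(adj, v, h)
    edges = {(a, b) for a in nodes for b in adj.get(a, ()) if b in nodes and a < b}
    return nodes, edges

# Toy snapshot: a path graph 1-2-3-4-5-6, with target edge (3, 4).
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
nodes, edges = enclosing_subgraph(adj, edge=(3, 4), h=1)
```

Running this per snapshot gives one small subgraph per timestamp, which is the kind of input a per-snapshot graph encoder followed by a recurrent layer would consume.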

Heterogeneous Graph Matching Networks

Oct 17, 2019

Shen Wang, Zhengzhang Chen, Xiao Yu, Ding Li, Jingchao Ni, Lu-An Tang, Jiaping Gui, Zhichun Li, Haifeng Chen, Philip S. Yu

Abstract: Information systems have widely been the target of malware attacks. Traditional signature-based malicious program detection algorithms can only detect known malware and are prone to evasion techniques such as binary obfuscation, while behavior-based approaches rely heavily on malware training samples and incur prohibitively high training cost. To address the limitations of existing techniques, we propose MatchGNet, a heterogeneous Graph Matching Network model to learn the graph representation and similarity metric simultaneously based on the invariant graph modeling of the program's execution behaviors. We conduct a systematic evaluation of our model and show that it is accurate in detecting malicious program behavior and can help detect malware attacks with fewer false positives. MatchGNet outperforms the state-of-the-art algorithms in malware detection by generating 50% fewer false positives while keeping zero false negatives.
