Yu Zheng

Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values

Jan 11, 2024

Yu Zheng, Huan Yee Koh, Ming Jin, Lianhua Chi, Haishuai Wang, Khoa T. Phan, Yi-Ping Phoebe Chen, Shirui Pan, Wei Xiang

Abstract: The detection of anomalies in multivariate time series data is crucial for various practical applications, including smart power grids, traffic flow forecasting, and industrial process control. However, real-world time series data is usually not well-structured, posing significant challenges to existing approaches: (1) The existence of missing values in multivariate time series data along variable and time dimensions hinders the effective modeling of interwoven spatial and temporal dependencies, resulting in important patterns being overlooked during model training; (2) Anomaly scoring with irregularly-sampled observations is less explored, making it difficult to use existing detectors for multivariate series without fully-observed values. In this work, we introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and anomaly scorer to tackle the aforementioned challenges in detecting anomalies on irregularly-sampled multivariate time series. Our approach comprises two main components. First, we propose a graph spatiotemporal process based on neural controlled differential equations. This process enables effective modeling of multivariate time series from both spatial and temporal perspectives, even when the data contains missing values. Second, we present a novel distribution-based anomaly scoring mechanism that alleviates the reliance on complete uniform observations. By analyzing the predictions of the graph spatiotemporal process, our approach allows anomalies to be easily detected. Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods, regardless of whether there are missing values present in the data. Our code is available: https://github.com/huankoh/GST-Pro.

* Accepted by Information Fusion

FlexSSL : A Generic and Efficient Framework for Semi-Supervised Learning

Dec 28, 2023

Huiling Qin, Xianyuan Zhan, Yuanxun Li, Yu Zheng

Abstract: Semi-supervised learning holds great promise for many real-world applications, due to its ability to leverage both unlabeled and expensive labeled data. However, most semi-supervised learning algorithms still heavily rely on the limited labeled data to infer and utilize the hidden information from unlabeled data. We note that any semi-supervised learning task under the self-training paradigm also hides an auxiliary task of discriminating label observability. Jointly solving these two tasks allows full utilization of information from both labeled and unlabeled data, thus alleviating the problem of over-reliance on labeled data. This naturally leads to a new generic and efficient learning framework without the reliance on any domain-specific information, which we call FlexSSL. The key idea of FlexSSL is to construct a semi-cooperative "game", which forges cooperation between a main self-interested semi-supervised learning task and a companion task that infers label observability to facilitate main task training. We show, through theoretical derivation, its connection to loss re-weighting on noisy labels. Through evaluations on a diverse range of tasks, we demonstrate that FlexSSL can consistently enhance the performance of semi-supervised learning algorithms.

Efficient Large Language Models: A Survey

Dec 23, 2023

Zhongwei Wan, Xin Wang, Che Liu, Samiul Alam, Yu Zheng, Jiachen Liu, Zhongnan Qu, Shen Yan, Yi Zhu, Quanlu Zhang(+2 more)

Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in important tasks such as natural language understanding, language generation, and complex reasoning and have the potential to make a substantial impact on our society. Such capabilities, however, come with the considerable resources they demand, highlighting the strong need to develop effective techniques for addressing their efficiency challenges. In this survey, we provide a systematic and comprehensive review of efficient LLMs research. We organize the literature in a taxonomy consisting of three main categories, covering distinct yet interconnected efficient LLMs topics from model-centric, data-centric, and framework-centric perspectives, respectively. We have also created a GitHub repository where we compile the papers featured in this survey at https://github.com/AIoT-MLSys-Lab/EfficientLLMs, and will actively maintain this repository and incorporate new research as it emerges. We hope our survey can serve as a valuable resource to help researchers and practitioners gain a systematic understanding of the research developments in efficient LLMs and inspire them to contribute to this important and exciting field.

* Version 2

Spherical Frustum Sparse Convolution Network for LiDAR Point Cloud Semantic Segmentation

Nov 29, 2023

Yu Zheng, Guangming Wang, Jiuming Liu, Marc Pollefeys, Hesheng Wang

Abstract: LiDAR point cloud semantic segmentation enables robots to obtain fine-grained semantic information of the surrounding environment. Recently, many works project the point cloud onto a 2D image and adopt 2D Convolutional Neural Networks (CNNs) or vision transformers for LiDAR point cloud semantic segmentation. However, since more than one point can be projected onto the same 2D position but only one point can be preserved, the previous 2D image-based segmentation methods suffer from inevitable quantized information loss. To avoid quantized information loss, in this paper, we propose a novel spherical frustum structure. The points projected onto the same 2D position are preserved in the spherical frustums. Moreover, we propose a memory-efficient hash-based representation of spherical frustums. Through the hash-based representation, we propose the Spherical Frustum sparse Convolution (SFC) and Frustum Fast Point Sampling (F2PS) to convolve and sample the points stored in spherical frustums respectively. Finally, we present the Spherical Frustum sparse Convolution Network (SFCNet) to adopt 2D CNNs for LiDAR point cloud semantic segmentation without quantized information loss. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate that our SFCNet outperforms the 2D image-based semantic segmentation methods based on conventional spherical projection. The source code will be released later.

* 17 pages, 10 figures, under review
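
The information loss that the abstract above attributes to conventional spherical projection can be illustrated with a minimal sketch. The image size, the field-of-view values, and the names `spherical_pixel` and `group_by_pixel` are assumptions chosen for illustration; this is not the SFCNet code, only a plain Python dictionary standing in for the idea of keeping every point that lands on the same 2D position.

```python
import math
from collections import defaultdict

def spherical_pixel(point, width=2048, height=64,
                    fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map a 3D point (x, y, z) to an integer pixel (u, v) of a range image.

    All default values are illustrative assumptions, not settings from the paper.
    """
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the sensor origin has no direction to project")
    yaw = math.atan2(y, x)           # horizontal angle
    pitch = math.asin(z / r)         # vertical angle
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    u = 0.5 * (1.0 - yaw / math.pi) * width
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * height
    # Quantize to the pixel grid and clamp to the image bounds.
    u = min(width - 1, max(0, int(math.floor(u))))
    v = min(height - 1, max(0, int(math.floor(v))))
    return u, v

def group_by_pixel(points, **kwargs):
    """Keep the indices of ALL points per pixel, keyed by (u, v)."""
    groups = defaultdict(list)
    for index, point in enumerate(points):
        groups[spherical_pixel(point, **kwargs)].append(index)
    return dict(groups)

# Two points on the same ray at different ranges, plus one elsewhere.
points = [(10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
groups = group_by_pixel(points)
collisions = {pix: idx for pix, idx in groups.items() if len(idx) > 1}
print(collisions)
```

A conventional range image would store a single point per key, so one of the two points sharing a pixel would be discarded; keeping the full list per key is the property the abstract describes for spherical frustums.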

Inverse Learning with Extremely Sparse Feedback for Recommendation

Nov 20, 2023

Guanyu Lin, Chen Gao, Yu Zheng, Yinfeng Li, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin(+1 more)

Abstract: Modern personalized recommendation services often rely on user feedback, either explicit or implicit, to improve the quality of services. Explicit feedback refers to behaviors like ratings, while implicit feedback refers to behaviors like user clicks. However, in the scenario of full-screen video viewing experiences like TikTok and Reels, the click action is absent, resulting in unclear feedback from users, which introduces noise into model training. Existing approaches to de-noising recommendation mainly focus on positive instances while ignoring the noise in a large amount of sampled negative feedback. In this paper, we propose a meta-learning method to annotate the unlabeled data from loss and gradient perspectives, which considers the noise in both positive and negative instances. Specifically, we first propose an Inverse Dual Loss (IDL) to boost the true label learning and prevent the false label learning. Then we further propose an Inverse Gradient (IG) method to explore the correct updating gradient and adjust the updating based on meta-learning. Finally, we conduct extensive experiments on both benchmark and industrial datasets where our proposed method can significantly improve AUC by 9.25% against state-of-the-art methods. Further analysis verifies the proposed inverse learning framework is model-agnostic and can improve a variety of recommendation backbones. The source code, along with the best hyper-parameter settings, is available at this link: https://github.com/Guanyu-Lin/InverseLearning.

* WSDM 2024

Mixed Attention Network for Cross-domain Sequential Recommendation

Nov 14, 2023

Guanyu Lin, Chen Gao, Yu Zheng, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin, Yong Li(+1 more)

Abstract: In modern recommender systems, sequential recommendation leverages chronological user behaviors to make effective next-item suggestions, which suffers from data sparsity issues, especially for new users. One promising line of work is the cross-domain recommendation, which trains models with data across multiple domains to improve the performance in data-scarce domains. Recently proposed cross-domain sequential recommendation models such as PiNet and DASL share a common drawback: they rely heavily on overlapped users in different domains, which limits their usage in practical recommender systems. In this paper, we propose a Mixed Attention Network (MAN) with local and global attention modules to extract the domain-specific and cross-domain information. Firstly, we propose a local/global encoding layer to capture the domain-specific/cross-domain sequential pattern. Then we propose a mixed attention layer with item similarity attention, sequence-fusion attention, and group-prototype attention to capture the local/global item similarity, fuse the local/global item sequence, and extract the user groups across different domains, respectively. Finally, we propose a local/global prediction layer to further evolve and combine the domain-specific and cross-domain interests. Experimental results on two real-world datasets (each with two domains) demonstrate the superiority of our proposed model. Further study also illustrates that our proposed method and components are model-agnostic and effective, respectively. The code and data are available at https://github.com/Guanyu-Lin/MAN.

* WSDM 2024

Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook

Oct 20, 2023

Ming Jin, Qingsong Wen, Yuxuan Liang, Chaoli Zhang, Siqiao Xue, Xue Wang, James Zhang, Yi Wang, Haifeng Chen, Xiaoli Li(+5 more)

Abstract: Temporal data, notably time series and spatio-temporal data, are prevalent in real-world applications. They capture dynamic system measurements and are produced in vast quantities by both physical and virtual sensors. Analyzing these data types is vital to harnessing the rich information they encompass and thus benefits a wide range of downstream tasks. Recent advances in large language and other foundational models have spurred increased use of these models in time series and spatio-temporal data mining. Such methodologies not only enable enhanced pattern recognition and reasoning across diverse domains but also lay the groundwork for artificial general intelligence capable of comprehending and processing common temporal data. In this survey, we offer a comprehensive and up-to-date review of large models tailored (or adapted) for time series and spatio-temporal data, spanning four key facets: data types, model categories, model scopes, and application areas/tasks. Our objective is to equip practitioners with the knowledge to develop applications and further research in this underexplored domain. We primarily categorize the existing literature into two major clusters: large models for time series analysis (LM4TS) and spatio-temporal data mining (LM4STD). On this basis, we further classify research based on model scopes (i.e., general vs. domain-specific) and application areas/tasks. We also provide a comprehensive collection of pertinent resources, including datasets, model assets, and useful tools, categorized by mainstream applications. This survey coalesces the latest strides in large model-centric research on time series and spatio-temporal data, underscoring the solid foundations, current advances, practical applications, abundant resources, and future research opportunities.

* Ongoing work; 24 pages, 3 figures, 3 tables; Github page: https://github.com/qingsongedu/Awesome-TimeSeries-SpatioTemporal-LM-LLM

FedFed: Feature Distillation against Data Heterogeneity in Federated Learning

Oct 08, 2023

Zhiqin Yang, Yonggang Zhang, Yu Zheng, Xinmei Tian, Hao Peng, Tongliang Liu, Bo Han

Abstract: Federated learning (FL) typically faces data heterogeneity, i.e., distribution shift among clients. Sharing clients' information has shown great potential in mitigating data heterogeneity, yet creates a dilemma between preserving privacy and promoting model performance. To alleviate the dilemma, we raise a fundamental question: Is it possible to share partial features in the data to tackle data heterogeneity? In this work, we give an affirmative answer to this question by proposing a novel approach called Federated Feature distillation (FedFed). Specifically, FedFed partitions data into performance-sensitive features (i.e., greatly contributing to model performance) and performance-robust features (i.e., limitedly contributing to model performance). The performance-sensitive features are globally shared to mitigate data heterogeneity, while the performance-robust features are kept locally. FedFed enables clients to train models over local and shared data. Comprehensive experiments demonstrate the efficacy of FedFed in promoting model performance.

* NeurIPS 2023
* 32 pages

Spatio-Temporal Contrastive Self-Supervised Learning for POI-level Crowd Flow Inference

Sep 12, 2023

Songyu Ke, Ting Li, Li Song, Yanping Sun, Qintian Sun, Junbo Zhang, Yu Zheng

Abstract: Accurate acquisition of crowd flow at Points of Interest (POIs) is pivotal for effective traffic management, public service, and urban planning. Despite this importance, due to the limitations of urban sensing techniques, the data quality from most sources is inadequate for monitoring crowd flow at each POI. This renders the inference of accurate crowd flow from low-quality data a critical and challenging task. The complexity is heightened by three key factors: 1) the scarcity and rarity of labeled data, 2) the intricate spatio-temporal dependencies among POIs, and 3) the myriad correlations between precise crowd flow and GPS reports. To address these challenges, we recast the crowd flow inference problem as a self-supervised attributed graph representation learning task and introduce a novel Contrastive Self-learning framework for Spatio-Temporal data (CSST). Our approach begins with the construction of a spatial adjacency graph founded on the POIs and their respective distances. We then employ a contrastive learning technique to exploit large volumes of unlabeled spatio-temporal data. We adopt a swapped prediction approach to anticipate the representation of the target subgraph from similar instances. Following the pre-training phase, the model is fine-tuned with accurate crowd flow data. Our experiments, conducted on two real-world datasets, demonstrate that the CSST pre-trained on extensive noisy data consistently outperforms models trained from scratch.

* 18 pages; submitted to TKDD

UNISOUND System for VoxCeleb Speaker Recognition Challenge 2023

Aug 24, 2023

Yu Zheng, Yajun Zhang, Chuanying Niu, Yibin Zhan, Yanhua Long, Dongxing Xu

Abstract: This report describes the UNISOUND submission for Track 1 and Track 2 of VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC 2023). We submit the same system on Track 1 and Track 2, which is trained with only VoxCeleb2-dev. Large-scale ResNet and RepVGG architectures are developed for the challenge. We propose a consistency-aware score calibration method, which leverages the stability of audio voiceprints in similarity score by a Consistency Measure Factor (CMF). CMF brings a huge performance boost in this challenge. Our final system is a fusion of six models and achieves first place in Track 1 and second place in Track 2 of VoxSRC 2023. The minDCF of our submission is 0.0855 and the EER is 1.5880%.
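
For readers unfamiliar with the EER figure quoted in the abstract above, the equal error rate is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping thresholds over the observed scores. The function name `equal_error_rate` and the toy scores are assumptions for illustration; this is not the challenge's official scoring tool, and it does not cover minDCF.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold.

    A trial is accepted when its score is >= the threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one target and one non-target score")
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False acceptance: a non-target trial that is accepted.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        # False rejection: a target trial that is rejected.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer

# Toy similarity scores (made up for illustration only).
targets = [0.9, 0.8, 0.3]
nontargets = [0.7, 0.2, 0.1]
print(f"EER = {equal_error_rate(targets, nontargets):.4f}")
```

With only a handful of trials the two error rates rarely cross exactly, which is why the sketch reports the midpoint at the closest threshold; evaluation toolkits typically interpolate instead.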
