Yi Xie

Enhancing Masked Time-Series Modeling via Dropping Patches

Dec 19, 2024

Tianyu Qiu, Yi Xie, Yun Xiong, Hao Niu, Xiaofeng Gao

Abstract: This paper explores how to enhance existing masked time-series modeling by randomly dropping sub-sequence-level patches of time series. On this basis, a simple yet effective method named DropPatch is proposed, which has two notable advantages: 1) it improves pre-training efficiency by a square-level advantage; 2) it provides additional benefits for modeling in scenarios such as in-domain, cross-domain, few-shot learning, and cold start. The paper conducts comprehensive experiments to verify the effectiveness of the method and analyze its internal mechanism. Empirically, DropPatch strengthens the attention mechanism, reduces information redundancy, and serves as an efficient means of data augmentation. Theoretically, it is proved that, by randomly dropping patches, DropPatch slows down the rate at which the Transformer representations collapse into the rank-1 linear subspace, thus improving the quality of the learned representations.
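
The idea of dropping patches before masking can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`patchify`, `drop_then_mask`), the ratios, and the order of operations are assumptions made for illustration only. One plausible reading of the "square-level" efficiency claim is that self-attention cost grows quadratically with the number of patches, so keeping a fraction p of the patches leaves roughly p² of the attention cost.

```python
import random

def patchify(series, patch_len):
    """Split a 1-D sequence into non-overlapping patches, discarding any tail remainder."""
    n_patches = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n_patches)]

def drop_then_mask(patches, drop_ratio, mask_ratio, rng):
    """Hypothetical helper: drop a random subset of patches, then mask some survivors.

    Dropped patches are removed entirely (the encoder never sees them);
    masked patches stay in the sequence as reconstruction targets.
    """
    n = len(patches)
    n_keep = max(1, round(n * (1 - drop_ratio)))
    kept_idx = sorted(rng.sample(range(n), n_keep))        # survivors, original order
    n_mask = round(n_keep * mask_ratio)
    masked_idx = set(rng.sample(kept_idx, n_mask))         # masked among survivors
    visible = [(i, patches[i]) for i in kept_idx if i not in masked_idx]
    targets = [(i, patches[i]) for i in kept_idx if i in masked_idx]
    return visible, targets

rng = random.Random(0)
series = list(range(96))
patches = patchify(series, patch_len=8)                    # 12 patches of length 8
visible, targets = drop_then_mask(patches, drop_ratio=0.5, mask_ratio=0.5, rng=rng)

# With half the patches dropped, pairwise attention over the survivors
# touches (6 / 12) ** 2 = 25% of the patch pairs of the full sequence.
attention_fraction = ((len(visible) + len(targets)) / len(patches)) ** 2
```

Because the random drop is resampled at every call, each pass over the same series yields a different sub-sequence, which is one way to read the abstract's remark that dropping acts as data augmentation.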

TSGaussian: Semantic and Depth-Guided Target-Specific Gaussian Splatting from Sparse Views

Dec 13, 2024

Liang Zhao, Zehan Bao, Yi Xie, Hong Chen, Yaohui Chen, Weifu Li

Abstract: Recent work on Gaussian Splatting has significantly advanced the field, achieving both panoptic and interactive segmentation of 3D scenes. However, existing methodologies often overlook the critical need for reconstructing specified targets with complex structures from sparse views. To address this issue, we introduce TSGaussian, a novel framework that combines semantic constraints with depth priors to avoid geometry degradation in challenging novel view synthesis tasks. Our approach concentrates computational resources on designated targets while minimizing the allocation to the background. Bounding boxes from YOLOv9 serve as prompts for the Segment Anything Model to generate 2D mask predictions, ensuring semantic accuracy and cost efficiency. TSGaussian effectively clusters 3D Gaussians by introducing a compact identity encoding for each Gaussian ellipsoid and incorporating 3D spatial consistency regularization. Leveraging these modules, we propose a pruning strategy to effectively reduce redundancy in 3D Gaussians. Extensive experiments demonstrate that TSGaussian outperforms state-of-the-art methods on three standard datasets and a new challenging dataset we collected, achieving superior results in novel view synthesis of specific objects. Code is available at: https://github.com/leon2000-ai/TSGaussian.

Look a Group at Once: Multi-Slide Modeling for Survival Prediction

Nov 18, 2024

Xinyang Li, Yi Zhang, Yi Xie, Jianfei Yang, Xi Wang, Hao Chen, Haixian Zhang

Abstract: Survival prediction is a critical task in pathology. In clinical practice, pathologists often examine multiple cases, leveraging a broader spectrum of cancer phenotypes to enhance pathological assessment. Despite significant advancements in deep learning, current solutions typically model each slide as a sample, struggling to effectively capture comparable and slide-agnostic pathological features. In this paper, we introduce GroupMIL, a novel framework inspired by the clinical practice of collective analysis, which models multiple slides as a single sample and organizes groups of patches and slides sequentially to capture cross-slide prognostic features. We also present GPAMamba, a model designed to facilitate intra- and inter-slide feature interactions, effectively capturing local micro-environmental characteristics within slide-level graphs while uncovering essential prognostic patterns across an extended patch sequence within the group framework. Furthermore, we develop a dual-head predictor that delivers comprehensive survival risk and probability assessments for each patient. Extensive empirical evaluations demonstrate that our model significantly outperforms state-of-the-art approaches across five datasets from The Cancer Genome Atlas.

Enhancing Dataset Distillation via Label Inconsistency Elimination and Learning Pattern Refinement

Oct 17, 2024

Chuhao Zhou, Chenxi Jiang, Yi Xie, Haozhi Cao, Jianfei Yang

Abstract: Dataset Distillation (DD) seeks to create a condensed dataset that, when used to train a model, enables the model to achieve performance similar to that of a model trained on the entire original dataset. It relieves model training from processing massive data and thus reduces computation, storage, and time costs. This paper presents our solution, which ranked 1st in the ECCV-2024 Data Distillation Challenge (track 1). Our solution, Modified Difficulty-Aligned Trajectory Matching (M-DATM), introduces two key modifications to the original state-of-the-art method DATM: (1) the soft labels learned by DATM do not achieve one-to-one correspondence with the counterparts generated by the official evaluation script, so we remove the soft-label technique to eliminate this inconsistency; (2) since the removal of soft labels makes it harder for the synthetic dataset to learn late trajectory information, particularly on Tiny ImageNet, we reduce the matching range, allowing the synthetic data to concentrate more on the easier patterns. In the final evaluation, our M-DATM achieved accuracies of 0.4061 and 0.1831 on the CIFAR-100 and Tiny ImageNet datasets, ranking 1st in the Fixed Images Per Class (IPC) Track.

* ECCV 2024 Dataset Distillation Challenge

MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants

Sep 30, 2024

Zeyu Zhang, Quanyu Dai, Luyu Chen, Zeren Jiang, Rui Li, Jieming Zhu, Xu Chen, Yi Xie, Zhenhua Dong, Ji-Rong Wen

Abstract: LLM-based agents have been widely applied as personal assistants, capable of memorizing information from user messages and responding to personal queries. However, an objective and automatic evaluation of their memory capability is still lacking, largely due to the challenges in constructing reliable questions and answers (QAs) from user messages. In this paper, we propose MemSim, a Bayesian simulator designed to automatically construct reliable QAs from generated user messages while preserving their diversity and scalability. Specifically, we introduce the Bayesian Relation Network (BRNet) and a causal generation mechanism to mitigate the impact of LLM hallucinations on factual information, facilitating the automatic creation of an evaluation dataset. Based on MemSim, we generate a dataset in the daily-life scenario, named MemDaily, and conduct extensive experiments to assess the effectiveness of our approach. We also provide a benchmark for evaluating different memory mechanisms in LLM-based agents with the MemDaily dataset. To benefit the research community, we have released our project at https://github.com/nuster1128/MemSim.

* 26 pages, 25 tables, 1 figure

MSSDA: Multi-Sub-Source Adaptation for Diabetic Foot Neuropathy Recognition

Sep 21, 2024

Yan Zhong, Zhixin Yan, Yi Xie, Shibin Wu, Huaidong Zhang, Lin Shu, Peiru Zhou

Abstract: Diabetic foot neuropathy (DFN) is a critical factor leading to diabetic foot ulcers, which are among the most common and severe complications of diabetes mellitus (DM) and are associated with high risks of amputation and mortality. Despite its significance, existing datasets do not directly derive from plantar data and lack continuous, long-term foot-specific information. To advance DFN research, we have collected a novel dataset comprising continuous plantar pressure data to recognize diabetic foot neuropathy. This dataset includes data from 94 DM patients with DFN and 41 DM patients without DFN. Moreover, traditional methods divide datasets by individuals, potentially leading to significant domain discrepancies in some feature spaces due to the absence of mid-domain data. In this paper, we propose an effective domain adaptation method to address this problem. We split the dataset based on convolutional feature statistics and select appropriate sub-source domains to enhance efficiency and avoid negative transfer. We then align the distributions of each source and target domain pair in specific feature spaces to minimize the domain gap. Comprehensive results validate the effectiveness of our method on both the newly proposed dataset for DFN recognition and an existing dataset.

An Efficient and Generalizable Symbolic Regression Method for Time Series Analysis

Sep 06, 2024

Yi Xie, Tianyu Qiu, Yun Xiong, Xiuqi Huang, Xiaofeng Gao, Chao Chen

Abstract: Time series analysis and prediction methods currently excel in quantitative analysis, offering accurate future predictions and diverse statistical indicators, but generally fall short in elucidating the underlying evolution patterns of time series. To gain a more comprehensive understanding and provide insightful explanations, we utilize symbolic regression techniques to derive explicit expressions for the non-linear dynamics in the evolution of time series variables. However, these techniques face challenges in computational efficiency and generalizability across diverse real-world time series data. To overcome these challenges, we propose Neural-Enhanced Monte-Carlo Tree Search (NEMoTS) for time series. NEMoTS leverages the exploration-exploitation balance of Monte-Carlo Tree Search (MCTS), significantly reducing the search space in symbolic regression and improving expression quality. Furthermore, by integrating neural networks with MCTS, NEMoTS not only capitalizes on their superior fitting capabilities to concentrate on more pertinent operations post-search space reduction, but also replaces the complex and time-consuming simulation process, thereby substantially improving computational efficiency and generalizability in time series analysis. NEMoTS offers an efficient and comprehensive approach to time series analysis. Experiments with three real-world datasets demonstrate NEMoTS's significant superiority in performance, efficiency, reliability, and interpretability, making it well-suited for large-scale real-world time series data.
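
The exploration-exploitation balance mentioned above is commonly realized in MCTS with the UCT (UCB1) selection rule. The sketch below shows that standard rule only; the abstract does not state which selection rule NEMoTS uses, and the operator names, statistics, and exploration constant here are hypothetical.

```python
import math

def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCT score: mean reward (exploitation) plus a visit-count bonus (exploration)."""
    if visits == 0:
        return math.inf                      # unvisited children are tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """Pick the child with the highest UCT score.

    `children` maps a candidate operator name to (total_reward, visits).
    """
    return max(children, key=lambda name: uct_score(*children[name], parent_visits))

# Hypothetical statistics for three candidate operators in an expression tree.
children = {"add": (3.0, 5), "mul": (1.0, 1), "sin": (0.0, 0)}
first_pick = select_child(children, parent_visits=6)

# Once every child has been visited, the bonus favors rarely tried operators.
visited_only = {"add": (3.0, 5), "mul": (1.0, 1)}
second_pick = select_child(visited_only, parent_visits=6)
```

In this toy example the unvisited operator is chosen first; among the visited ones, `mul` wins because it has been tried only once, so its exploration bonus is larger than that of `add`.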

Inference Performance Optimization for Large Language Models on CPUs

Jul 10, 2024

Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie

Abstract: Large language models (LLMs) have shown exceptional performance and vast potential across diverse tasks. However, the deployment of LLMs with high performance in low-resource environments has garnered significant attention in the industry. When GPU hardware resources are limited, we can explore alternative options on CPUs. To mitigate the financial burden and alleviate constraints imposed by hardware resources, optimizing inference performance is necessary. In this paper, we introduce an easily deployable inference performance optimization solution aimed at accelerating LLMs on CPUs. In this solution, we implement an effective way to reduce the KV cache size while ensuring precision. We propose a distributed inference optimization approach and implement it based on the oneAPI Collective Communications Library. Furthermore, we propose optimization approaches for LLMs on CPUs and conduct tailored optimizations for the most commonly used models. The code is open-sourced at https://github.com/intel/xFasterTransformer.

* 5 pages, 6 figures, ICML 2024 on Foundation Models in the Wild

AttacKG+: Boosting Attack Knowledge Graph Construction with Large Language Models

May 08, 2024

Yongheng Zhang, Tingwen Du, Yunshan Ma, Xiang Wang, Yi Xie, Guozheng Yang, Yuliang Lu, Ee-Chien Chang

Abstract: Attack knowledge graph construction seeks to convert textual cyber threat intelligence (CTI) reports into structured representations, portraying the evolutionary traces of cyber attacks. Even though previous research has proposed various methods to construct attack knowledge graphs, these methods generally generalize poorly to diverse knowledge types and require expertise in model design and tuning. To address these limitations, we seek to utilize Large Language Models (LLMs), which have achieved enormous success in a broad range of tasks thanks to their exceptional capabilities in both language understanding and zero-shot task fulfillment. Thus, we propose a fully automatic LLM-based framework for constructing attack knowledge graphs, named AttacKG+. Our framework consists of four consecutive modules: rewriter, parser, identifier, and summarizer, each of which is implemented by instruction prompting and in-context learning empowered by LLMs. Furthermore, we upgrade the existing attack knowledge schema and propose a comprehensive version. We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation: behavior graph, MITRE TTP labels, and state summary. Extensive evaluation demonstrates that: 1) our formulation seamlessly satisfies the information needs in threat event analysis, 2) our construction framework is effective in faithfully and accurately extracting the information defined by AttacKG+, and 3) our attack graph directly benefits downstream security practices such as attack reconstruction. All the code and datasets will be released upon acceptance.

* 20 pages, 5 figures

Beimingwu: A Learnware Dock System

Jan 24, 2024

Zhi-Hao Tan, Jian-Dong Liu, Xiao-Dong Bi, Peng Tan, Qin-Cheng Zheng, Hai-Tian Liu, Yi Xie, Xiao-Chuan Zou, Yang Yu, Zhi-Hua Zhou

Abstract: The learnware paradigm proposed by Zhou [2016] aims to enable users to reuse numerous existing well-trained models instead of building machine learning models from scratch, with the hope of solving new user tasks even beyond models' original purposes. In this paradigm, developers worldwide can submit their high-performing models spontaneously to the learnware dock system (formerly known as learnware market) without revealing their training data. Once the dock system accepts the model, it assigns a specification and accommodates the model. This specification allows the model to be adequately identified and assembled for reuse according to future users' needs, even if they have no prior knowledge of the model. This paradigm differs greatly from the current big-model direction, and it is expected that a learnware dock system housing millions or more high-performing models could offer excellent capabilities both for planned tasks where big models are applicable and for unplanned, specialized, data-sensitive scenarios where big models are not present or applicable. This paper describes Beimingwu, the first open-source learnware dock system providing foundational support for future research on the learnware paradigm. The system significantly streamlines model development for new user tasks, thanks to its integrated architecture and engine design, extensive engineering implementations and optimizations, and the integration of various algorithms for learnware identification and reuse. Notably, this is possible even for users with limited data and minimal expertise in machine learning, without compromising the raw data's security. Beimingwu supports the entire process of the learnware paradigm. The system lays the foundation for future research in learnware-related algorithms and systems, and prepares the ground for hosting a vast array of learnwares and establishing a learnware ecosystem.
