Min Wu

Knowledge-Guided Biomarker Identification for Label-Free Single-Cell RNA-Seq Data: A Reinforcement Learning Perspective

Jan 02, 2025

Meng Xiao, Weiliang Zhang, Xiaohan Huang, Hengshu Zhu, Min Wu, Xiaoli Li, Yuanchun Zhou

Abstract:Gene panel selection aims to identify the most informative genomic biomarkers in label-free genomic datasets. Traditional approaches, which rely on domain expertise, embedded machine learning models, or heuristic-based iterative optimization, often introduce biases and inefficiencies, potentially obscuring critical biological signals. To address these challenges, we present an iterative gene panel selection strategy that harnesses ensemble knowledge from existing gene selection algorithms to establish preliminary boundaries or prior knowledge, which guide the initial search space. Subsequently, we incorporate reinforcement learning through a reward function shaped by expert behavior, enabling dynamic refinement and targeted selection of gene panels. This integration mitigates biases stemming from initial boundaries while capitalizing on RL's stochastic adaptability. Comprehensive comparative experiments, case studies, and downstream analyses demonstrate the effectiveness of our method, highlighting its improved precision and efficiency for label-free biomarker discovery. Our results underscore the potential of this approach to advance single-cell genomics data analysis.

* 20 pages. arXiv admin note: substantial text overlap with arXiv:2406.07418

Augmented Contrastive Clustering with Uncertainty-Aware Prototyping for Time Series Test Time Adaptation

Jan 01, 2025

Peiliang Gong, Mohamed Ragab, Min Wu, Zhenghua Chen, Yongyi Su, Xiaoli Li, Daoqiang Zhang

Abstract:Test-time adaptation aims to adapt pre-trained deep neural networks using solely online unlabelled test data during inference. Although TTA has shown promise in visual applications, its potential in time series contexts remains largely unexplored. Existing TTA methods, originally designed for visual tasks, may not effectively handle the complex temporal dynamics of real-world time series data, resulting in suboptimal adaptation performance. To address this gap, we propose Augmented Contrastive Clustering with Uncertainty-aware Prototyping (ACCUP), a straightforward yet effective TTA method for time series data. Initially, our approach employs augmentation ensemble on the time series data to capture diverse temporal information and variations, incorporating uncertainty-aware prototypes to distill essential characteristics. Additionally, we introduce an entropy comparison scheme to selectively acquire more confident predictions, enhancing the reliability of pseudo labels. Furthermore, we utilize augmented contrastive clustering to enhance feature discriminability and mitigate error accumulation from noisy pseudo labels, promoting cohesive clustering within the same class while facilitating clear separation between different classes. Extensive experiments conducted on three real-world time series datasets and an additional visual dataset demonstrate the effectiveness and generalization potential of the proposed method, advancing the underexplored realm of TTA for time series data.
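
The entropy comparison idea mentioned in the ACCUP abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names and the choice of comparing exactly two candidate predictions are assumptions made for illustration. It only shows the general principle of keeping whichever prediction has the lower Shannon entropy, that is, the more confident one.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pick_more_confident(pred_a, pred_b):
    """Return the candidate prediction with the lower entropy.

    pred_a and pred_b are hypothetical class-probability vectors for the
    same test sample (for example, from two different prediction heads).
    """
    return pred_a if entropy(pred_a) <= entropy(pred_b) else pred_b

# A peaked prediction is preferred over a near-uniform one.
chosen = pick_more_confident([0.9, 0.1], [0.5, 0.5])
```

The argmax of the chosen vector could then serve as a pseudo label, under the assumption that lower entropy correlates with correctness.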

GeneSUM: Large Language Model-based Gene Summary Extraction

Dec 24, 2024

Zhijian Chen, Chuan Hu, Min Wu, Qingqing Long, Xuezhi Wang, Yuanchun Zhou, Meng Xiao

Abstract: Emerging topics in biomedical research are continuously expanding, providing a wealth of information about genes and their function. This rapid proliferation of knowledge presents unprecedented opportunities for scientific discovery and formidable challenges for researchers striving to keep abreast of the latest advancements. One significant challenge is navigating the vast corpus of literature to extract vital gene-related information, a time-consuming and cumbersome task. To enhance the efficiency of this process, it is crucial to address several key challenges: (1) the overwhelming volume of literature, (2) the complexity of gene functions, and (3) the need for automated integration and summary generation. In response, we propose GeneSUM, a two-stage automated gene summary extractor utilizing a large language model (LLM). Our approach retrieves literature on the target gene, removes redundancy, and then fine-tunes the LLM to refine and streamline the summarization process. We conducted extensive experiments to validate the efficacy of our proposed framework. The results demonstrate that the LLM significantly enhances the integration of gene-specific information, allowing more efficient decision-making in ongoing research.

* 7 pages, Accepted by BIBM 2024

Surface-Based Authentication System for Integrated Circuit Chips

Dec 19, 2024

Runze Liu, Prasun Datta, Anirudh Nakra, Chau-Wai Wong, Min Wu

Abstract: The rapid development of the semiconductor industry and the ubiquity of electronic devices have led to a significant increase in the counterfeiting of integrated circuits (ICs). This poses a major threat to public health, the banking industry, and military defense sectors that are heavily reliant on electronic systems. Electronic physically unclonable functions (PUFs) are widely used to authenticate IC chips at the unit level. However, electronic PUFs are limited by their requirement for IC chips to be in working condition for measurements and by their sensitivity to environmental variations. This paper proposes using optical PUFs for IC chip authentication by leveraging the unique microscopic structures of the packaging surface of individual IC chips. The proposed method relies on color images of IC chip surfaces acquired using a flatbed scanner or mobile camera. Our initial study reveals that these consumer-grade imaging devices can capture meaningful physical features from IC chip surfaces. We then propose an efficient, lightweight verification scheme leveraging specular-reflection-based features extracted from videos, achieving an equal error rate (EER) of 0.0008. We conducted factor, sensitivity, and ablation studies to understand the detailed characteristics of the proposed lightweight verification scheme. This work is the first to apply the optical PUF principle for the authentication of IC chips and has the potential to significantly enhance the security of the semiconductor supply chain.
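
For readers unfamiliar with the equal error rate (EER) reported in the abstract above, the following is a minimal sketch of how such a figure is commonly approximated from match scores. The score lists, the function name, and the simple threshold sweep are assumptions for illustration and say nothing about how the authors computed their value.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping thresholds over all observed scores.

    genuine:  similarity scores for matching pairs (higher = more similar)
    impostor: similarity scores for non-matching pairs
    A pair is accepted when its score is >= the threshold.
    """
    best_gap, best_eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

# Hypothetical scores: one genuine pair and one impostor pair overlap.
eer = equal_error_rate([0.9, 0.6, 0.4, 0.8], [0.1, 0.5, 0.3, 0.7])
```

With finite samples the false-accept and false-reject curves rarely cross exactly, so the sketch returns their average at the closest threshold.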

Towards Open-Vocabulary Video Semantic Segmentation

Dec 12, 2024

Xinhao Li, Yun Liu, Guolei Sun, Min Wu, Le Zhang, Ce Zhu

Abstract:Semantic segmentation in videos has been a focal point of recent research. However, existing models encounter challenges when faced with unfamiliar categories. To address this, we introduce the Open Vocabulary Video Semantic Segmentation (OV-VSS) task, designed to accurately segment every pixel across a wide range of open-vocabulary categories, including those that are novel or previously unexplored. To enhance OV-VSS performance, we propose a robust baseline, OV2VSS, which integrates a spatial-temporal fusion module, allowing the model to utilize temporal relationships across consecutive frames. Additionally, we incorporate a random frame enhancement module, broadening the model's understanding of semantic context throughout the entire video sequence. Our approach also includes video text encoding, which strengthens the model's capability to interpret textual information within the video context. Comprehensive evaluations on benchmark datasets such as VSPW and Cityscapes highlight OV-VSS's zero-shot generalization capabilities, especially in handling novel categories. The results validate OV2VSS's effectiveness, demonstrating improved performance in semantic segmentation tasks across diverse video datasets.

* 13 pages, 7 figures

Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation

Oct 29, 2024

Zhaochong An, Guolei Sun, Yun Liu, Runjia Li, Min Wu, Ming-Ming Cheng, Ender Konukoglu, Serge Belongie

Abstract: Few-shot 3D point cloud segmentation (FS-PCS) aims at generalizing models to segment novel categories with minimal annotated support samples. While existing FS-PCS methods have shown promise, they primarily focus on unimodal point cloud inputs, overlooking the potential benefits of leveraging multimodal information. In this paper, we address this gap by introducing a cost-free multimodal FS-PCS setup, utilizing textual labels and the potentially available 2D image modality. Under this easy-to-achieve setup, we present the MultiModal Few-Shot SegNet (MM-FSS), a model effectively harnessing complementary information from multiple modalities. MM-FSS employs a shared backbone with two heads to extract intermodal and unimodal visual features, and a pretrained text encoder to generate text embeddings. To fully exploit the multimodal information, we propose a Multimodal Correlation Fusion (MCF) module to generate multimodal correlations, and a Multimodal Semantic Fusion (MSF) module to refine the correlations using text-aware semantic guidance. Additionally, we propose a simple yet effective Test-time Adaptive Cross-modal Calibration (TACC) technique to mitigate training bias, further improving generalization. Experimental results on S3DIS and ScanNet datasets demonstrate significant performance improvements achieved by our method. The efficacy of our approach indicates the benefits of leveraging commonly-ignored free modalities for FS-PCS, providing valuable insights for future research. The code is available at https://github.com/ZhaochongAn/Multimodality-3D-Few-Shot.

Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift

Oct 13, 2024

Yanru Sun, Zongxia Xie, Emadeldeen Eldele, Dongyue Chen, Qinghua Hu, Min Wu

Abstract: Time series forecasting, which aims to predict future values based on historical data, has garnered significant attention due to its broad range of applications. However, real-world time series often exhibit complex non-uniform distributions with varying patterns across segments, such as season, operating condition, or semantic meaning, making accurate forecasting challenging. Existing approaches, which typically train a single model to capture all these diverse patterns, often struggle with the pattern drifts between patches and may generalize poorly. To address these challenges, we propose TFPS, a novel architecture that leverages pattern-specific experts for more accurate and adaptable time series forecasting. TFPS employs a dual-domain encoder to capture both time-domain and frequency-domain features, enabling a more comprehensive understanding of temporal dynamics. It then uses subspace clustering to dynamically identify distinct patterns across data patches. Finally, pattern-specific experts model these unique patterns, delivering tailored predictions for each patch. By explicitly learning and adapting to evolving patterns, TFPS achieves significantly improved forecasting accuracy. Extensive experiments on real-world datasets demonstrate that TFPS outperforms state-of-the-art methods, particularly in long-term forecasting, through its dynamic and pattern-aware learning approach. The data and code are available at https://github.com/syrGitHub/TFPS.

Temporal Source Recovery for Time-Series Source-Free Unsupervised Domain Adaptation

Sep 29, 2024

Yucheng Wang, Peiliang Gong, Min Wu, Felix Ott, Xiaoli Li, Lihua Xie, Zhenghua Chen

Abstract: Source-Free Unsupervised Domain Adaptation (SFUDA) has gained popularity for its ability to adapt pretrained models to target domains without accessing source domains, ensuring source data privacy. While SFUDA is well-developed in visual tasks, its application to Time-Series SFUDA (TS-SFUDA) remains limited due to the challenge of transferring crucial temporal dependencies across domains. Although a few researchers have begun to explore this area, they rely on specific source domain designs, which are impractical as source data owners cannot be expected to follow particular pretraining protocols. To solve this, we propose Temporal Source Recovery (TemSR), a framework that transfers temporal dependencies for effective TS-SFUDA without requiring source-specific designs. TemSR features a recovery process that leverages masking, recovery, and optimization to generate a source-like distribution with recovered source temporal dependencies. To ensure effective recovery, we further design segment-based regularization to restore local dependencies and anchor-based recovery diversity maximization to enhance the diversity of the source-like distribution. The source-like distribution is then adapted to the target domain using traditional UDA techniques. Extensive experiments across multiple TS tasks demonstrate the effectiveness of TemSR, which even surpasses existing TS-SFUDA methods that require source domain designs. Code is available at https://github.com/Frank-Wang-oss/TemSR.

Better Verified Explanations with Applications to Incorrectness and Out-of-Distribution Detection

Sep 04, 2024

Min Wu, Xiaofu Li, Haoze Wu, Clark Barrett

Abstract:Building on VeriX (Verified eXplainability, arXiv:2212.01051), a system for producing optimal verified explanations for machine learning model outputs, we present VeriX+, which significantly improves both the size and the generation time of verified explanations. We introduce a bound propagation-based sensitivity technique to improve the size, and a binary search-based traversal with confidence ranking for improving time -- the two techniques are orthogonal and can be used independently or together. We also show how to adapt the QuickXplain (Junker 2004) algorithm to our setting to provide a trade-off between size and time. Experimental evaluations on standard benchmarks demonstrate significant improvements on both metrics, e.g., a size reduction of 38% on the GTSRB dataset and a time reduction of 90% on MNIST. We also explore applications of our verified explanations and show that explanation size is a useful proxy for both incorrectness detection and out-of-distribution detection.
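
To give a feel for why a binary search-based traversal can save time, here is a toy sketch. It is not the VeriX+ algorithm or code: `is_robust` is a hypothetical stand-in for an expensive verifier call, the feature order is assumed to come from some prior ranking, and the toy oracle below simply hard-codes which features matter. The idea shown is that a whole batch of features can be declared irrelevant with a single check, and the batch is only split when that check fails.

```python
def binary_search_explanation(features, is_robust):
    """Split features into an explanation and an irrelevant set.

    features:  feature indices in traversal order (assumed pre-ranked).
    is_robust: hypothetical oracle; given a set of features allowed to
               vary freely, returns True if the model output cannot change.
    """
    explanation, irrelevant = [], set()

    def visit(batch):
        if not batch:
            return
        # One oracle call may discharge the entire batch at once.
        if is_robust(irrelevant | set(batch)):
            irrelevant.update(batch)
            return
        if len(batch) == 1:
            explanation.append(batch[0])  # this feature must stay fixed
            return
        mid = len(batch) // 2
        visit(batch[:mid])
        visit(batch[mid:])

    visit(list(features))
    return explanation

# Toy oracle: the output depends only on features 2 and 5.
important = {2, 5}
toy_oracle = lambda free: not (free & important)
result = binary_search_explanation(range(8), toy_oracle)
```

A purely sequential traversal would call the oracle once per feature; batching pays off when most features are irrelevant, which is an assumption about the data rather than a guarantee.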

Operator Feature Neural Network for Symbolic Regression

Aug 14, 2024

Yusong Deng, Min Wu, Lina Yu, Jingyi Liu, Shu Wei, Yanjie Li, Weijun Li

Abstract: Symbolic regression is a task aimed at identifying patterns in data and representing them through mathematical expressions, generally involving skeleton prediction and constant optimization. Many methods have achieved some success; however, they treat variables and symbols merely as characters of natural language without considering their mathematical essence. This paper introduces the operator feature neural network (OF-Net), which employs operator representation for expressions and proposes an implicit feature encoding method for the intrinsic mathematical operational logic of operators. By substituting operator features for numeric loss, we can predict the combination of operators of target expressions. We evaluate the model on public datasets, and the results demonstrate that the model achieves superior recovery rates and high $R^2$ scores. In discussing the results, we analyze the merits and demerits of OF-Net and propose optimization schemes.

* 12 pages
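
The $R^2$ score mentioned in the OF-Net abstract is the standard coefficient of determination, $R^2 = 1 - \mathrm{SS}_{\mathrm{res}} / \mathrm{SS}_{\mathrm{tot}}$. A minimal sketch of how it could be computed for a candidate expression follows; the function name and sample values are illustrative assumptions, not taken from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot

# A perfect fit scores 1.0; always predicting the mean scores 0.0.
perfect = r2_score([1, 2, 3], [1, 2, 3])
baseline = r2_score([1, 2, 3], [2, 2, 2])
```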
