Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ren Wang

PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

Jun 09, 2024

Wei Li, Pin-Yu Chen, Sijia Liu, Ren Wang

Abstract:Deep neural networks are susceptible to backdoor attacks, where adversaries manipulate model predictions by inserting malicious samples into the training data. Currently, there is still a lack of direct filtering methods for identifying suspicious training data to unveil potential backdoor samples. In this paper, we propose a novel method, Prediction Shift Backdoor Detection (PSBD), leveraging an uncertainty-based approach requiring minimal unlabeled clean validation data. PSBD is motivated by an intriguing Prediction Shift (PS) phenomenon, where poisoned models' predictions on clean data often shift away from true labels towards certain other labels with dropout applied during inference, while backdoor samples exhibit less PS. We hypothesize PS results from neuron bias effect, making neurons favor features of certain classes. PSBD identifies backdoor training samples by computing the Prediction Shift Uncertainty (PSU), the variance in probability values when dropout layers are toggled on and off during model inference. Extensive experiments have been conducted to verify the effectiveness and efficiency of PSBD, which achieves state-of-the-art results among mainstream detection methods.

Via

Access Paper or Ask Questions

Securing GNNs: Explanation-Based Identification of Backdoored Training Graphs

Mar 26, 2024

Jane Downer, Ren Wang, Binghui Wang

Figure 1 for Securing GNNs: Explanation-Based Identification of Backdoored Training Graphs

Figure 2 for Securing GNNs: Explanation-Based Identification of Backdoored Training Graphs

Figure 3 for Securing GNNs: Explanation-Based Identification of Backdoored Training Graphs

Figure 4 for Securing GNNs: Explanation-Based Identification of Backdoored Training Graphs

Abstract:Graph Neural Networks (GNNs) have gained popularity in numerous domains, yet they are vulnerable to backdoor attacks that can compromise their performance and ethical application. The detection of these attacks is crucial for maintaining the reliability and security of GNN classification tasks, but effective detection techniques are lacking. Following an initial investigation, we observed that while graph-level explanations can offer limited insights, their effectiveness in detecting backdoor triggers is inconsistent and incomplete. To bridge this gap, we extract and transform secondary outputs of GNN explanation mechanisms, designing seven novel metrics that more effectively detect backdoor attacks. Additionally, we develop an adaptive attack to rigorously evaluate our approach. We test our method on multiple benchmark datasets and examine its efficacy against various attack models. Our results show that our method can achieve high detection performance, marking a significant advancement in safeguarding GNNs against backdoor attacks.

Via

Access Paper or Ask Questions

Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency

Mar 15, 2024

Soumyadeep Pal, Yuguang Yao, Ren Wang, Bingquan Shen, Sijia Liu

Figure 1 for Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency

Figure 2 for Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency

Figure 3 for Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency

Figure 4 for Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency

Abstract:Modern machine learning (ML) systems demand substantial training data, often resorting to external sources. Nevertheless, this practice renders them vulnerable to backdoor poisoning attacks. Prior backdoor defense strategies have primarily focused on the identification of backdoored models or poisoned data characteristics, typically operating under the assumption of access to clean data. In this work, we delve into a relatively underexplored challenge: the automatic identification of backdoor data within a poisoned dataset, all under realistic conditions, i.e., without the need for additional clean data or without manually defining a threshold for backdoor detection. We draw an inspiration from the scaled prediction consistency (SPC) technique, which exploits the prediction invariance of poisoned data to an input scaling factor. Based on this, we pose the backdoor data identification problem as a hierarchical data splitting optimization problem, leveraging a novel SPC-based loss function as the primary optimization objective. Our innovation unfolds in several key aspects. First, we revisit the vanilla SPC method, unveiling its limitations in addressing the proposed backdoor identification problem. Subsequently, we develop a bi-level optimization-based approach to precisely identify backdoor data by minimizing the advanced SPC loss. Finally, we demonstrate the efficacy of our proposal against a spectrum of backdoor attacks, encompassing basic label-corrupted attacks as well as more sophisticated clean-label attacks, evaluated across various benchmark datasets. Experiment results show that our approach often surpasses the performance of current baselines in identifying backdoor data points, resulting in about 4%-36% improvement in average AUROC. Codes are available at https://github.com/OPTML-Group/BackdoorMSPC.

* The Twelfth International Conference on Learning Representations (ICLR 2024)

Via

Access Paper or Ask Questions

Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

Feb 22, 2024

Zhiyuan Wang, Jinhao Duan, Chenxi Yuan, Qingyu Chen, Tianlong Chen, Huaxiu Yao, Yue Zhang, Ren Wang, Kaidi Xu, Xiaoshuang Shi

Figure 1 for Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

Figure 2 for Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

Figure 3 for Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

Figure 4 for Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

Abstract:Uncertainty estimation plays a pivotal role in ensuring the reliability of safety-critical human-AI interaction systems, particularly in the medical domain. However, a general method for quantifying the uncertainty of free-form answers has yet to be established in open-ended medical question-answering (QA) tasks, where irrelevant words and sequences with limited semantic information can be the primary source of uncertainty due to the presence of generative inequality. In this paper, we propose the Word-Sequence Entropy (WSE), which calibrates the uncertainty proportion at both the word and sequence levels according to the semantic relevance, with greater emphasis placed on keywords and more relevant sequences when performing uncertainty quantification. We compare WSE with 6 baseline methods on 5 free-form medical QA datasets, utilizing 7 "off-the-shelf" large language models (LLMs), and show that WSE exhibits superior performance on accurate uncertainty measurement under two standard criteria for correctness evaluation (e.g., WSE outperforms existing state-of-the-art method by 3.23% AUROC on the MedQA dataset). Additionally, in terms of the potential for real-world medical QA applications, we achieve a significant enhancement in the performance of LLMs when employing sequences with lower uncertainty, identified by WSE, as final answers (e.g., +6.36% accuracy improvement on the COVID-QA dataset), without requiring any additional task-specific fine-tuning or architectural modifications.

* 18 pages

Via

Access Paper or Ask Questions

PINNs-Based Uncertainty Quantification for Transient Stability Analysis

Nov 21, 2023

Ren Wang, Ming Zhong, Kaidi Xu, Lola Giráldez Sánchez-Cortés, Ignacio de Cominges Guerra

Abstract:This paper addresses the challenge of transient stability in power systems with missing parameters and uncertainty propagation in swing equations. We introduce a novel application of Physics-Informed Neural Networks (PINNs), specifically an Ensemble of PINNs (E-PINNs), to estimate critical parameters like rotor angle and inertia coefficient with enhanced accuracy and reduced computational load. E-PINNs capitalize on the underlying physical principles of swing equations to provide a robust solution. Our approach not only facilitates efficient parameter estimation but also quantifies uncertainties, delivering probabilistic insights into the system behavior. The efficacy of E-PINNs is demonstrated through the analysis of $1$-bus and $2$-bus systems, highlighting the model's ability to handle parameter variability and data scarcity. The study advances the application of machine learning in power system stability, paving the way for reliable and computationally efficient transient stability analysis.

Via

Access Paper or Ask Questions

Efficient and Effective Multi-View Subspace Clustering for Large-scale Data

Oct 15, 2023

Yuxiu Lin, Hui Liu, Ren Wang, Gongguan Chen, Caiming Zhang

Figure 1 for Efficient and Effective Multi-View Subspace Clustering for Large-scale Data

Figure 2 for Efficient and Effective Multi-View Subspace Clustering for Large-scale Data

Figure 3 for Efficient and Effective Multi-View Subspace Clustering for Large-scale Data

Figure 4 for Efficient and Effective Multi-View Subspace Clustering for Large-scale Data

Abstract:Recent multi-view subspace clustering achieves impressive results utilizing deep networks, where the self-expressive correlation is typically modeled by a fully connected (FC) layer. However, they still suffer from two limitations: i) it is under-explored to extract a unified representation from multiple views that simultaneously satisfy minimal sufficiency and discriminability. ii) the parameter scale of the FC layer is quadratic to the number of samples, resulting in high time and memory costs that significantly degrade their feasibility in large-scale datasets. In light of this, we propose a novel deep framework termed Efficient and Effective Large-scale Multi-View Subspace Clustering (E$^2$LMVSC). Specifically, to enhance the quality of the unified representation, a soft clustering assignment similarity constraint is devised for explicitly decoupling consistent, complementary, and superfluous information across multi-view data. Then, following information bottleneck theory, a sufficient yet minimal unified feature representation is obtained. Moreover, E$^2$LMVSC employs the maximal coding rate reduction principle to promote intra-cluster aggregation and inter-cluster separability within the unified representation. Finally, the self-expressive coefficients are learned by a Relation-Metric Net instead of a parameterized FC layer for greater efficiency. Extensive experiments show that E$^2$LMVSC yields comparable results to existing methods and achieves state-of-the-art clustering performance in large-scale multi-view datasets.

Via

Access Paper or Ask Questions

Improving Generalization in Meta-Learning via Meta-Gradient Augmentation

Jun 14, 2023

Ren Wang, Haoliang Sun, Qi Wei, Xiushan Nie, Yuling Ma, Yilong Yin

Figure 1 for Improving Generalization in Meta-Learning via Meta-Gradient Augmentation

Figure 2 for Improving Generalization in Meta-Learning via Meta-Gradient Augmentation

Figure 3 for Improving Generalization in Meta-Learning via Meta-Gradient Augmentation

Figure 4 for Improving Generalization in Meta-Learning via Meta-Gradient Augmentation

Abstract:Meta-learning methods typically follow a two-loop framework, where each loop potentially suffers from notorious overfitting, hindering rapid adaptation and generalization to new tasks. Existing schemes solve it by enhancing the mutual-exclusivity or diversity of training samples, but these data manipulation strategies are data-dependent and insufficiently flexible. This work alleviates overfitting in meta-learning from the perspective of gradient regularization and proposes a data-independent \textbf{M}eta-\textbf{G}radient \textbf{Aug}mentation (\textbf{MGAug}) method. The key idea is to first break the rote memories by network pruning to address memorization overfitting in the inner loop, and then the gradients of pruned sub-networks naturally form the high-quality augmentation of the meta-gradient to alleviate learner overfitting in the outer loop. Specifically, we explore three pruning strategies, including \textit{random width pruning}, \textit{random parameter pruning}, and a newly proposed \textit{catfish pruning} that measures a Meta-Memorization Carrying Amount (MMCA) score for each parameter and prunes high-score ones to break rote memories as much as possible. The proposed MGAug is theoretically guaranteed by the generalization bound from the PAC-Bayes framework. In addition, we extend a lightweight version, called MGAug-MaxUp, as a trade-off between performance gains and resource overhead. Extensive experiments on multiple few-shot learning benchmarks validate MGAug's effectiveness and significant improvement over various meta-baselines. The code is publicly available at \url{https://github.com/xxLifeLover/Meta-Gradient-Augmentation}.

Via

Access Paper or Ask Questions

Robust Mode Connectivity-Oriented Adversarial Defense: Enhancing Neural Network Robustness Against Diversified $\ell_p$ Attacks

Mar 17, 2023

Ren Wang, Yuxuan Li, Sijia Liu

$Figure 1 for Robust Mode Connectivity-Oriented Adversarial Defense: Enhancing Neural Network Robustness Against Diversified $\ell_p$ Attacks$

$Figure 2 for Robust Mode Connectivity-Oriented Adversarial Defense: Enhancing Neural Network Robustness Against Diversified $\ell_p$ Attacks$

$Figure 3 for Robust Mode Connectivity-Oriented Adversarial Defense: Enhancing Neural Network Robustness Against Diversified $\ell_p$ Attacks$

$Figure 4 for Robust Mode Connectivity-Oriented Adversarial Defense: Enhancing Neural Network Robustness Against Diversified $\ell_p$ Attacks$

Abstract:Adversarial robustness is a key concept in measuring the ability of neural networks to defend against adversarial attacks during the inference phase. Recent studies have shown that despite the success of improving adversarial robustness against a single type of attack using robust training techniques, models are still vulnerable to diversified $\ell_p$ attacks. To achieve diversified $\ell_p$ robustness, we propose a novel robust mode connectivity (RMC)-oriented adversarial defense that contains two population-based learning phases. The first phase, RMC, is able to search the model parameter space between two pre-trained models and find a path containing points with high robustness against diversified $\ell_p$ attacks. In light of the effectiveness of RMC, we develop a second phase, RMC-based optimization, with RMC serving as the basic unit for further enhancement of neural network diversified $\ell_p$ robustness. To increase computational efficiency, we incorporate learning with a self-robust mode connectivity (SRMC) module that enables the fast proliferation of the population used for endpoints of RMC. Furthermore, we draw parallels between SRMC and the human immune system. Experimental results on various datasets and model architectures demonstrate that the proposed defense methods can achieve high diversified $\ell_p$ robustness against $\ell_\infty$, $\ell_2$, $\ell_1$, and hybrid attacks. Codes are available at \url{https://github.com/wangren09/MCGR}.

Via

Access Paper or Ask Questions

MetaViewer: Towards A Unified Multi-View Representation

Mar 11, 2023

Ren Wang, Haoliang Sun, Yuling Ma, Xiaoming Xi, Yilong Yin

Figure 1 for MetaViewer: Towards A Unified Multi-View Representation

Figure 2 for MetaViewer: Towards A Unified Multi-View Representation

Figure 3 for MetaViewer: Towards A Unified Multi-View Representation

Figure 4 for MetaViewer: Towards A Unified Multi-View Representation

Abstract:Existing multi-view representation learning methods typically follow a specific-to-uniform pipeline, extracting latent features from each view and then fusing or aligning them to obtain the unified object representation. However, the manually pre-specify fusion functions and view-private redundant information mixed in features potentially degrade the quality of the derived representation. To overcome them, we propose a novel bi-level-optimization-based multi-view learning framework, where the representation is learned in a uniform-to-specific manner. Specifically, we train a meta-learner, namely MetaViewer, to learn fusion and model the view-shared meta representation in outer-level optimization. Start with this meta representation, view-specific base-learners are then required to rapidly reconstruct the corresponding view in inner-level. MetaViewer eventually updates by observing reconstruction processes from uniform to specific over all views, and learns an optimal fusion scheme that separates and filters out view-private information. Extensive experimental results in downstream tasks such as classification and clustering demonstrate the effectiveness of our method.

* 8 pages, 5 figures, conference

Via

Access Paper or Ask Questions

Fine-Grained Classification with Noisy Labels

Mar 04, 2023

Qi Wei, Lei Feng, Haoliang Sun, Ren Wang, Chenhui Guo, Yilong Yin

Figure 1 for Fine-Grained Classification with Noisy Labels

Figure 2 for Fine-Grained Classification with Noisy Labels

Figure 3 for Fine-Grained Classification with Noisy Labels

Figure 4 for Fine-Grained Classification with Noisy Labels

Abstract:Learning with noisy labels (LNL) aims to ensure model generalization given a label-corrupted training set. In this work, we investigate a rarely studied scenario of LNL on fine-grained datasets (LNL-FG), which is more practical and challenging as large inter-class ambiguities among fine-grained classes cause more noisy labels. We empirically show that existing methods that work well for LNL fail to achieve satisfying performance for LNL-FG, arising the practical need of effective solutions for LNL-FG. To this end, we propose a novel framework called stochastic noise-tolerated supervised contrastive learning (SNSCL) that confronts label noise by encouraging distinguishable representation. Specifically, we design a noise-tolerated supervised contrastive learning loss that incorporates a weight-aware mechanism for noisy label correction and selectively updating momentum queue lists. By this mechanism, we mitigate the effects of noisy anchors and avoid inserting noisy labels into the momentum-updated queue. Besides, to avoid manually-defined augmentation strategies in contrastive learning, we propose an efficient stochastic module that samples feature embeddings from a generated distribution, which can also enhance the representation ability of deep models. SNSCL is general and compatible with prevailing robust LNL strategies to improve their performance for LNL-FG. Extensive experiments demonstrate the effectiveness of SNSCL.

* Accepted to CVPR 2023

Via

Access Paper or Ask Questions