Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pin-Yu Chen

Breaking Focus: Contextual Distraction Curse in Large Language Models

Feb 03, 2025

Yue Huang, Yanbo Wang, Zixiang Xu, Chujie Gao, Siyuan Wu, Jiayi Ye, Xiuying Chen, Pin-Yu Chen, Xiangliang Zhang

Figure 1 for Breaking Focus: Contextual Distraction Curse in Large Language Models

Figure 2 for Breaking Focus: Contextual Distraction Curse in Large Language Models

Figure 3 for Breaking Focus: Contextual Distraction Curse in Large Language Models

Figure 4 for Breaking Focus: Contextual Distraction Curse in Large Language Models

Abstract:Recent advances in Large Language Models (LLMs) have revolutionized generative systems, achieving excellent performance across diverse domains. Although these models perform well in controlled environments, their real-world applications frequently encounter inputs containing both essential and irrelevant details. Our investigation has revealed a critical vulnerability in LLMs, which we term Contextual Distraction Vulnerability (CDV). This phenomenon arises when models fail to maintain consistent performance on questions modified with semantically coherent but irrelevant context. To systematically investigate this vulnerability, we propose an efficient tree-based search methodology to automatically generate CDV examples. Our approach successfully generates CDV examples across four datasets, causing an average performance degradation of approximately 45% in state-of-the-art LLMs. To address this critical issue, we explore various mitigation strategies and find that post-targeted training approaches can effectively enhance model robustness against contextual distractions. Our findings highlight the fundamental nature of CDV as an ability-level challenge rather than a knowledge-level issue since models demonstrate the necessary knowledge by answering correctly in the absence of distractions. This calls the community's attention to address CDV during model development to ensure reliability. The code is available at https://github.com/wyf23187/LLM_CDV.

Via

Access Paper or Ask Questions

CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling

Jan 27, 2025

Kaiyuan Zhang, Siyuan Cheng, Guangyu Shen, Bruno Ribeiro, Shengwei An, Pin-Yu Chen, Xiangyu Zhang, Ninghui Li

Figure 1 for CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling

Figure 2 for CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling

Figure 3 for CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling

Figure 4 for CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling

Abstract:Federated learning collaboratively trains a neural network on a global server, where each local client receives the current global model weights and sends back parameter updates (gradients) based on its local private data. The process of sending these model updates may leak client's private data information. Existing gradient inversion attacks can exploit this vulnerability to recover private training instances from a client's gradient vectors. Recently, researchers have proposed advanced gradient inversion techniques that existing defenses struggle to handle effectively. In this work, we present a novel defense tailored for large neural network models. Our defense capitalizes on the high dimensionality of the model parameters to perturb gradients within a subspace orthogonal to the original gradient. By leveraging cold posteriors over orthogonal subspaces, our defense implements a refined gradient update mechanism. This enables the selection of an optimal gradient that not only safeguards against gradient inversion attacks but also maintains model utility. We conduct comprehensive experiments across three different datasets and evaluate our defense against various state-of-the-art attacks and defenses. Code is available at https://censor-gradient.github.io.

* Accepted by 32nd Annual Network and Distributed System Security Symposium (NDSS 2025). Code is available at https://censor-gradient.github.io

Via

Access Paper or Ask Questions

Differentiable Prompt Learning for Vision Language Models

Dec 31, 2024

Zhenhan Huang, Tejaswini Pedapati, Pin-Yu Chen, Jianxi Gao

Figure 1 for Differentiable Prompt Learning for Vision Language Models

Figure 2 for Differentiable Prompt Learning for Vision Language Models

Figure 3 for Differentiable Prompt Learning for Vision Language Models

Figure 4 for Differentiable Prompt Learning for Vision Language Models

Abstract:Prompt learning is an effective way to exploit the potential of large-scale pre-trained foundational models. Continuous prompts parameterize context tokens in prompts by turning them into differentiable vectors. Deep continuous prompts insert prompts not only in the input but also in the intermediate hidden representations. Manually designed deep continuous prompts exhibit a remarkable improvement compared to the zero-shot pre-trained model on downstream tasks. How to automate the continuous prompt design is an underexplored area, and a fundamental question arises, is manually designed deep prompt strategy optimal? To answer this question, we propose a method dubbed differentiable prompt learning (DPL). The DPL method is formulated as an optimization problem to automatically determine the optimal context length of the prompt to be added to each layer, where the objective is to maximize the performance. We test the DPL method on the pre-trained CLIP. We empirically find that by using only limited data, our DPL method can find deep continuous prompt configuration with high confidence. The performance on the downstream tasks exhibits the superiority of the automatic design: our method boosts the average test accuracy by 2.60% on 11 datasets compared to baseline methods. Besides, our method focuses only on the prompt configuration (i.e. context length for each layer), which means that our method is compatible with the baseline methods that have sophisticated designs to boost the performance. The DPL method can be deployed to large language models or computer vision models at no cost.

Via

Access Paper or Ask Questions

Retention Score: Quantifying Jailbreak Risks for Vision Language Models

Dec 23, 2024

Zaitang Li, Pin-Yu Chen, Tsung-Yi Ho

Abstract:The emergence of Vision-Language Models (VLMs) is a significant advancement in integrating computer vision with Large Language Models (LLMs) to enhance multi-modal machine learning capabilities. However, this progress has also made VLMs vulnerable to sophisticated adversarial attacks, raising concerns about their reliability. The objective of this paper is to assess the resilience of VLMs against jailbreak attacks that can compromise model safety compliance and result in harmful outputs. To evaluate a VLM's ability to maintain its robustness against adversarial input perturbations, we propose a novel metric called the \textbf{Retention Score}. Retention Score is a multi-modal evaluation metric that includes Retention-I and Retention-T scores for quantifying jailbreak risks in visual and textual components of VLMs. Our process involves generating synthetic image-text pairs using a conditional diffusion model. These pairs are then predicted for toxicity score by a VLM alongside a toxicity judgment classifier. By calculating the margin in toxicity scores, we can quantify the robustness of the VLM in an attack-agnostic manner. Our work has four main contributions. First, we prove that Retention Score can serve as a certified robustness metric. Second, we demonstrate that most VLMs with visual components are less robust against jailbreak attacks than the corresponding plain VLMs. Additionally, we evaluate black-box VLM APIs and find that the security settings in Google Gemini significantly affect the score and robustness. Moreover, the robustness of GPT4V is similar to the medium settings of Gemini. Finally, our approach offers a time-efficient alternative to existing adversarial attack methods and provides consistent model robustness rankings when evaluated on VLMs including MiniGPT-4, InstructBLIP, and LLaVA.

* AAAI 2025
* 14 pages, 8 figures, AAAI 2025

Via

Access Paper or Ask Questions

MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models

Dec 05, 2024

Ming-Chang Chiu, Shicheng Wen, Pin-Yu Chen, Xuezhe Ma

Figure 1 for MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models

Figure 2 for MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models

Figure 3 for MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models

Figure 4 for MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models

Abstract:In vision-language models (VLMs), the ability to perceive and interpret color and physical environment is crucial for achieving contextually accurate understanding and interaction. However, despite advances in multimodal modeling, there remains a significant lack of specialized datasets that rigorously evaluate a model's capacity to discern subtle color variations and spatial context -- critical elements for situational comprehension and reliable deployment across real-world applications. Toward that goal, we curate MegaCOIN, a high-quality, human-labeled dataset based on \emph{real} images with various contextual attributes. MegaCOIN consists of two parts: MegaCOIN-Instruct, which serves as a supervised fine-tuning (SFT) dataset for VLMs; and MegaCOIN-Bench, an annotated test set that can be used as a stand-alone QA dataset. MegaCOIN~provides three annotated features for 220,000 real images: foreground color, background color, and description of an object's physical environment, constituting 660k human annotations. In addition, MegaCOIN can be applied to benchmark domain generalization (DG) algorithms. We explore benchmarking DG methods in the linear probing setup for VLM and show some new insights. Last but not least, we show that VLMs, including GPT-4o, have subpar color recognition capabilities, and fine-tuning with MegaCOIN can result in improved performance on visual evaluation tasks. In certain cases, MegaCOIN fine-tuned small-scale opensource models such as LLaVA and Bunny can outperform closed-source GPT-4o. We hope the utilities of MegaCOIN can shed light on the directions VLMs can improve and provide a more complex platform for domain generalization algorithms.

* 8 pages, 13 tables, 2 figures

Via

Access Paper or Ask Questions

Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models

Nov 28, 2024

Chung-Ting Tsai, Ching-Yun Ko, I-Hsin Chung, Yu-Chiang Frank Wang, Pin-Yu Chen

Figure 1 for Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models

Figure 2 for Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models

Figure 3 for Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models

Figure 4 for Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models

Abstract:The rapid advancement of generative models has introduced serious risks, including deepfake techniques for facial synthesis and editing. Traditional approaches rely on training classifiers and enhancing generalizability through various feature extraction techniques. Meanwhile, training-free detection methods address issues like limited data and overfitting by directly leveraging statistical properties from vision foundation models to distinguish between real and fake images. The current leading training-free approach, RIGID, utilizes DINOv2 sensitivity to perturbations in image space for detecting fake images, with fake image embeddings exhibiting greater sensitivity than those of real images. This observation prompts us to investigate how detection performance varies across model backbones, perturbation types, and datasets. Our experiments reveal that detection performance is closely linked to model robustness, with self-supervised (SSL) models providing more reliable representations. While Gaussian noise effectively detects general objects, it performs worse on facial images, whereas Gaussian blur is more effective due to potential frequency artifacts. To further improve detection, we introduce Contrastive Blur, which enhances performance on facial images, and MINDER (MINimum distance DetEctoR), which addresses noise type bias, balancing performance across domains. Beyond performance gains, our work offers valuable insights for both the generative and detection communities, contributing to a deeper understanding of model robustness property utilized for deepfake detection.

Via

Access Paper or Ask Questions

In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models

Nov 25, 2024

Zhi-Yi Chin, Kuan-Chen Mu, Mario Fritz, Pin-Yu Chen, Wei-Chen Chiu

Abstract:Text-to-image (T2I) models have shown remarkable progress, but their potential to generate harmful content remains a critical concern in the ML community. While various safety mechanisms have been developed, the field lacks systematic tools for evaluating their effectiveness against real-world misuse scenarios. In this work, we propose ICER, a novel red-teaming framework that leverages Large Language Models (LLMs) and a bandit optimization-based algorithm to generate interpretable and semantic meaningful problematic prompts by learning from past successful red-teaming attempts. Our ICER efficiently probes safety mechanisms across different T2I models without requiring internal access or additional training, making it broadly applicable to deployed systems. Through extensive experiments, we demonstrate that ICER significantly outperforms existing prompt attack methods in identifying model vulnerabilities while maintaining high semantic similarity with intended content. By uncovering that successful jailbreaking instances can systematically facilitate the discovery of new vulnerabilities, our work provides crucial insights for developing more robust safety mechanisms in T2I systems.

Via

Access Paper or Ask Questions

Quantum Machine Learning: An Interplay Between Quantum Computing and Machine Learning

Nov 14, 2024

Jun Qi, Chao-Han Yang, Samuel Yen-Chi Chen, Pin-Yu Chen

Figure 1 for Quantum Machine Learning: An Interplay Between Quantum Computing and Machine Learning

Figure 2 for Quantum Machine Learning: An Interplay Between Quantum Computing and Machine Learning

Figure 3 for Quantum Machine Learning: An Interplay Between Quantum Computing and Machine Learning

Figure 4 for Quantum Machine Learning: An Interplay Between Quantum Computing and Machine Learning

Abstract:Quantum machine learning (QML) is a rapidly growing field that combines quantum computing principles with traditional machine learning. It seeks to revolutionize machine learning by harnessing the unique capabilities of quantum mechanics and employs machine learning techniques to advance quantum computing research. This paper introduces quantum computing for the machine learning paradigm, where variational quantum circuits (VQC) are used to develop QML architectures on noisy intermediate-scale quantum (NISQ) devices. We discuss machine learning for the quantum computing paradigm, showcasing our recent theoretical and empirical findings. In particular, we delve into future directions for studying QML, exploring the potential industrial impacts of QML research.

* In submission

Via

Access Paper or Ask Questions

Leveraging Pre-Trained Neural Networks to Enhance Machine Learning with Variational Quantum Circuits

Nov 13, 2024

Jun Qi, Chao-Han Yang, Samuel Yen-Chi Chen, Pin-Yu Chen, Hector Zenil, Jesper Tegner

Abstract:Quantum Machine Learning (QML) offers tremendous potential but is currently limited by the availability of qubits. We introduce an innovative approach that utilizes pre-trained neural networks to enhance Variational Quantum Circuits (VQC). This technique effectively separates approximation error from qubit count and removes the need for restrictive conditions, making QML more viable for real-world applications. Our method significantly improves parameter optimization for VQC while delivering notable gains in representation and generalization capabilities, as evidenced by rigorous theoretical analysis and extensive empirical testing on quantum dot classification tasks. Moreover, our results extend to applications such as human genome analysis, demonstrating the broad applicability of our approach. By addressing the constraints of current quantum hardware, our work paves the way for a new era of advanced QML applications, unlocking the full potential of quantum computing in fields such as machine learning, materials science, medicine, mimetics, and various interdisciplinary areas.

* In submission

Via

Access Paper or Ask Questions

Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs

Nov 11, 2024

Megh Thakkar, Yash More, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, Sarath Chandar

Figure 1 for Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs

Figure 2 for Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs

Figure 3 for Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs

Figure 4 for Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs

Abstract:There is a growing interest in training domain-expert LLMs that excel in specific technical fields compared to their general-purpose instruction-tuned counterparts. However, these expert models often experience a loss in their safety abilities in the process, making them capable of generating harmful content. As a solution, we introduce an efficient and effective merging-based alignment method called \textsc{MergeAlign} that interpolates the domain and alignment vectors, creating safer domain-specific models while preserving their utility. We apply \textsc{MergeAlign} on Llama3 variants that are experts in medicine and finance, obtaining substantial alignment improvements with minimal to no degradation on domain-specific benchmarks. We study the impact of model merging through model similarity metrics and contributions of individual models being merged. We hope our findings open new research avenues and inspire more efficient development of safe expert LLMs.

Via

Access Paper or Ask Questions