Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yang Xiang

TF-Attack: Transferable and Fast Adversarial Attacks on Large Language Models

Aug 26, 2024

Zelin Li, Kehai Chen, Xuefeng Bai, Lemao Liu, Mingming Yang, Yang Xiang, Min Zhang

Abstract:With the great advancements in large language models (LLMs), adversarial attacks against LLMs have recently attracted increasing attention. We found that pre-existing adversarial attack methodologies exhibit limited transferability and are notably inefficient, particularly when applied to LLMs. In this paper, we analyze the core mechanisms of previous predominant adversarial attack methods, revealing that 1) the distributions of importance score differ markedly among victim models, restricting the transferability; 2) the sequential attack processes induces substantial time overheads. Based on the above two insights, we introduce a new scheme, named TF-Attack, for Transferable and Fast adversarial attacks on LLMs. TF-Attack employs an external LLM as a third-party overseer rather than the victim model to identify critical units within sentences. Moreover, TF-Attack introduces the concept of Importance Level, which allows for parallel substitutions of attacks. We conduct extensive experiments on 6 widely adopted benchmarks, evaluating the proposed method through both automatic and human metrics. Results show that our method consistently surpasses previous methods in transferability and delivers significant speed improvements, up to 20 times faster than earlier attack strategies.

* 14 pages, 6 figures

Via

Access Paper or Ask Questions

Benchmarking LLMs for Translating Classical Chinese Poetry:Evaluating Adequacy, Fluency, and Elegance

Aug 19, 2024

Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang

Abstract:Large language models (LLMs) have shown remarkable performance in general translation tasks. However, the increasing demand for high-quality translations that are not only adequate but also fluent and elegant. To assess the extent to which current LLMs can meet these demands, we introduce a suitable benchmark for translating classical Chinese poetry into English. This task requires not only adequacy in translating culturally and historically significant content but also a strict adherence to linguistic fluency and poetic elegance. Our study reveals that existing LLMs fall short of this task. To address these issues, we propose RAT, a \textbf{R}etrieval-\textbf{A}ugmented machine \textbf{T}ranslation method that enhances the translation process by incorporating knowledge related to classical poetry. Additionally, we propose an automatic evaluation metric based on GPT-4, which better assesses translation quality in terms of adequacy, fluency, and elegance, overcoming the limitations of traditional metrics. Our dataset and code will be made available.

* Work in progress

Via

Access Paper or Ask Questions

EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection

Jul 27, 2024

Shigang Liu, Di Cao, Junae Kim, Tamas Abraham, Paul Montague, Seyit Camtepe, Jun Zhang, Yang Xiang

Figure 1 for EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection

Figure 2 for EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection

Figure 3 for EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection

Figure 4 for EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection

Abstract:Recently, deep learning has demonstrated promising results in enhancing the accuracy of vulnerability detection and identifying vulnerabilities in software. However, these techniques are still vulnerable to attacks. Adversarial examples can exploit vulnerabilities within deep neural networks, posing a significant threat to system security. This study showcases the susceptibility of deep learning models to adversarial attacks, which can achieve 100% attack success rate (refer to Table 5). The proposed method, EaTVul, encompasses six stages: identification of important samples using support vector machines, identification of important features using the attention mechanism, generation of adversarial data based on these features using ChatGPT, preparation of an adversarial attack pool, selection of seed data using a fuzzy genetic algorithm, and the execution of an evasion attack. Extensive experiments demonstrate the effectiveness of EaTVul, achieving an attack success rate of more than 83% when the snippet size is greater than 2. Furthermore, in most cases with a snippet size of 4, EaTVul achieves a 100% attack success rate. The findings of this research emphasize the necessity of robust defenses against adversarial attacks in software vulnerability detection.

Via

Access Paper or Ask Questions

Efffcient Sensing Parameter Estimation with Direct Clutter Mitigation in Perceptive Mobile Networks

Jul 24, 2024

Hang Li, Hongming Yang, Qinghua Guo, J. Andrew Zhang, Yang Xiang, Yashan Pang

Figure 1 for Efffcient Sensing Parameter Estimation with Direct Clutter Mitigation in Perceptive Mobile Networks

Figure 2 for Efffcient Sensing Parameter Estimation with Direct Clutter Mitigation in Perceptive Mobile Networks

Figure 3 for Efffcient Sensing Parameter Estimation with Direct Clutter Mitigation in Perceptive Mobile Networks

Figure 4 for Efffcient Sensing Parameter Estimation with Direct Clutter Mitigation in Perceptive Mobile Networks

Abstract:In this work, we investigate sensing parameter estimation in the presence of clutter in perceptive mobile networks (PMNs) that integrate radar sensing into mobile communications. Performing clutter suppression before sensing parameter estimation is generally desirable as the number of sensing parameters can be signiffcantly reduced. However, existing methods require high-complexity clutter mitigation and sensing parameter estimation, where clutter is ffrstly identiffed and then removed. In this correspondence, we propose a much simpler but more effective method by incorporating a clutter cancellation mechanism in formulating a sparse signal model for sensing parameter estimation. In particular, clutter mitigation is performed directly on the received signals and the unitary approximate message passing (UAMP) is leveraged to exploit the common support for sensing parameter estimation in the formulated sparse signal recovery problem. Simulation results show that, compared to state-of-theart methods, the proposed method delivers signiffcantly better performance while with substantially reduced complexity.

Via

Access Paper or Ask Questions

A fast neural hybrid Newton solver adapted to implicit methods for nonlinear dynamics

Jul 04, 2024

Tianyu Jin, Georg Maierhofer, Katharina Schratz, Yang Xiang

Figure 1 for A fast neural hybrid Newton solver adapted to implicit methods for nonlinear dynamics

Figure 2 for A fast neural hybrid Newton solver adapted to implicit methods for nonlinear dynamics

Figure 3 for A fast neural hybrid Newton solver adapted to implicit methods for nonlinear dynamics

Figure 4 for A fast neural hybrid Newton solver adapted to implicit methods for nonlinear dynamics

Abstract:The use of implicit time-stepping schemes for the numerical approximation of solutions to stiff nonlinear time-evolution equations brings well-known advantages including, typically, better stability behaviour and corresponding support of larger time steps, and better structure preservation properties. However, this comes at the price of having to solve a nonlinear equation at every time step of the numerical scheme. In this work, we propose a novel operator learning based hybrid Newton's method to accelerate this solution of the nonlinear time step system for stiff time-evolution nonlinear equations. We propose a targeted learning strategy which facilitates robust unsupervised learning in an offline phase and provides a highly efficient initialisation for the Newton iteration leading to consistent acceleration of Newton's method. A quantifiable rate of improvement in Newton's method achieved by improved initialisation is provided and we analyse the upper bound of the generalisation error of our unsupervised learning strategy. These theoretical results are supported by extensive numerical results, demonstrating the efficiency of our proposed neural hybrid solver both in one- and two-dimensional cases.

Via

Access Paper or Ask Questions

Paying More Attention to Source Context: Mitigating Unfaithful Translations from Large Language Model

Jun 11, 2024

Hongbin Zhang, Kehai Chen, Xuefeng Bai, Yang Xiang, Min Zhang

Abstract:Large language models (LLMs) have showcased impressive multilingual machine translation ability. However, unlike encoder-decoder style models, decoder-only LLMs lack an explicit alignment between source and target contexts. Analyzing contribution scores during generation processes revealed that LLMs can be biased towards previously generated tokens over corresponding source tokens, leading to unfaithful translations. To address this issue, we propose to encourage LLMs to pay more attention to the source context from both source and target perspectives in zeroshot prompting: 1) adjust source context attention weights; 2) suppress irrelevant target prefix influence; Additionally, we propose 3) avoiding over-reliance on the target prefix in instruction tuning. Experimental results from both human-collected unfaithfulness test sets focusing on LLM-generated unfaithful translations and general test sets, verify our methods' effectiveness across multiple language pairs. Further human evaluation shows our method's efficacy in reducing hallucinatory translations and facilitating faithful translation generation.

* Accepted by ACL2024 Findings

Via

Access Paper or Ask Questions

DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms

Jun 11, 2024

Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang

Abstract:Recently, large language models (LLMs) enhanced by self-reflection have achieved promising performance on machine translation. The key idea is guiding LLMs to generate translation with human-like feedback. However, existing self-reflection methods lack effective feedback information, limiting the translation performance. To address this, we introduce a DUAL-REFLECT framework, leveraging the dual learning of translation tasks to provide effective feedback, thereby enhancing the models' self-reflective abilities and improving translation performance. The application of this method across various translation tasks has proven its effectiveness in improving translation accuracy and eliminating ambiguities, especially in translation tasks with low-resource language pairs.

* Accepted to ACL 2024 main conference

Via

Access Paper or Ask Questions

Memorization in deep learning: A survey

Jun 06, 2024

Jiaheng Wei, Yanjun Zhang, Leo Yu Zhang, Ming Ding, Chao Chen, Kok-Leong Ong, Jun Zhang, Yang Xiang

Figure 1 for Memorization in deep learning: A survey

Figure 2 for Memorization in deep learning: A survey

Figure 3 for Memorization in deep learning: A survey

Figure 4 for Memorization in deep learning: A survey

Abstract:Deep Learning (DL) powered by Deep Neural Networks (DNNs) has revolutionized various domains, yet understanding the intricacies of DNN decision-making and learning processes remains a significant challenge. Recent investigations have uncovered an interesting memorization phenomenon in which DNNs tend to memorize specific details from examples rather than learning general patterns, affecting model generalization, security, and privacy. This raises critical questions about the nature of generalization in DNNs and their susceptibility to security breaches. In this survey, we present a systematic framework to organize memorization definitions based on the generalization and security/privacy domains and summarize memorization evaluation methods at both the example and model levels. Through a comprehensive literature review, we explore DNN memorization behaviors and their impacts on security and privacy. We also introduce privacy vulnerabilities caused by memorization and the phenomenon of forgetting and explore its connection with memorization. Furthermore, we spotlight various applications leveraging memorization and forgetting mechanisms, including noisy label learning, privacy preservation, and model enhancement. This survey offers the first-in-kind understanding of memorization in DNNs, providing insights into its challenges and opportunities for enhancing AI development while addressing critical ethical concerns.

Via

Access Paper or Ask Questions

AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways

Jun 04, 2024

Zehang Deng, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, Yang Xiang

Abstract:An Artificial Intelligence (AI) agent is a software entity that autonomously performs tasks or makes decisions based on pre-defined objectives and data inputs. AI agents, capable of perceiving user inputs, reasoning and planning tasks, and executing actions, have seen remarkable advancements in algorithm development and task performance. However, the security challenges they pose remain under-explored and unresolved. This survey delves into the emerging security threats faced by AI agents, categorizing them into four critical knowledge gaps: unpredictability of multi-step user inputs, complexity in internal executions, variability of operational environments, and interactions with untrusted external entities. By systematically reviewing these threats, this paper highlights both the progress made and the existing limitations in safeguarding AI agents. The insights provided aim to inspire further research into addressing the security threats associated with AI agents, thereby fostering the development of more robust and secure AI agent applications.

* ACM Computing Survey

Via

Access Paper or Ask Questions

Automatic Differentiation is Essential in Training Neural Networks for Solving Differential Equations

May 23, 2024

Chuqi Chen, Yahong Yang, Yang Xiang, Wenrui Hao

Figure 1 for Automatic Differentiation is Essential in Training Neural Networks for Solving Differential Equations

Figure 2 for Automatic Differentiation is Essential in Training Neural Networks for Solving Differential Equations

Figure 3 for Automatic Differentiation is Essential in Training Neural Networks for Solving Differential Equations

Figure 4 for Automatic Differentiation is Essential in Training Neural Networks for Solving Differential Equations

Abstract:Neural network-based approaches have recently shown significant promise in solving partial differential equations (PDEs) in science and engineering, especially in scenarios featuring complex domains or the incorporation of empirical data. One advantage of the neural network method for PDEs lies in its automatic differentiation (AD), which necessitates only the sample points themselves, unlike traditional finite difference (FD) approximations that require nearby local points to compute derivatives. In this paper, we quantitatively demonstrate the advantage of AD in training neural networks. The concept of truncated entropy is introduced to characterize the training property. Specifically, through comprehensive experimental and theoretical analyses conducted on random feature models and two-layer neural networks, we discover that the defined truncated entropy serves as a reliable metric for quantifying the residual loss of random feature models and the training speed of neural networks for both AD and FD methods. Our experimental and theoretical analyses demonstrate that, from a training perspective, AD outperforms FD in solving partial differential equations.

Via

Access Paper or Ask Questions