Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yang Liu

Addressing Polarization and Unfairness in Performative Prediction

Jun 24, 2024

Kun Jin, Tian Xie, Yang Liu, Xueru Zhang

Figure 1 for Addressing Polarization and Unfairness in Performative Prediction

Figure 2 for Addressing Polarization and Unfairness in Performative Prediction

Figure 3 for Addressing Polarization and Unfairness in Performative Prediction

Figure 4 for Addressing Polarization and Unfairness in Performative Prediction

Abstract:When machine learning (ML) models are used in applications that involve humans (e.g., online recommendation, school admission, hiring, lending), the model itself may trigger changes in the distribution of targeted data it aims to predict. Performative prediction (PP) is a framework that explicitly considers such model-dependent distribution shifts when learning ML models. While significant efforts have been devoted to finding performative stable (PS) solutions in PP for system robustness, their societal implications are less explored and it is unclear whether PS solutions are aligned with social norms such as fairness. In this paper, we set out to examine the fairness property of PS solutions in performative prediction. We first show that PS solutions can incur severe polarization effects and group-wise loss disparity. Although existing fairness mechanisms commonly used in literature can help mitigate unfairness, they may fail and disrupt the stability under model-dependent distribution shifts. We thus propose novel fairness intervention mechanisms that can simultaneously achieve both stability and fairness in PP settings. Both theoretical analysis and experiments are provided to validate the proposed method.

Via

Access Paper or Ask Questions

Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking

Jun 23, 2024

Yuwei Zhang, Tong Xia, Jing Han, Yu Wu, Georgios Rizos, Yang Liu, Mohammed Mosuily, Jagmohan Chauhan, Cecilia Mascolo

Figure 1 for Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking

Figure 2 for Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking

Figure 3 for Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking

Figure 4 for Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking

Abstract:Respiratory audio, such as coughing and breathing sounds, has predictive power for a wide range of healthcare applications, yet is currently under-explored. The main problem for those applications arises from the difficulty in collecting large labeled task-specific data for model development. Generalizable respiratory acoustic foundation models pretrained with unlabeled data would offer appealing advantages and possibly unlock this impasse. However, given the safety-critical nature of healthcare applications, it is pivotal to also ensure openness and replicability for any proposed foundation model solution. To this end, we introduce OPERA, an OPEn Respiratory Acoustic foundation model pretraining and benchmarking system, as the first approach answering this need. We curate large-scale respiratory audio datasets (~136K samples, 440 hours), pretrain three pioneering foundation models, and build a benchmark consisting of 19 downstream respiratory health tasks for evaluation. Our pretrained models demonstrate superior performance (against existing acoustic models pretrained with general audio on 16 out of 19 tasks) and generalizability (to unseen datasets and new respiratory audio modalities). This highlights the great promise of respiratory acoustic foundation models and encourages more studies using OPERA as an open resource to accelerate research on respiratory audio for health. The system is accessible from https://github.com/evelyn0414/OPERA.

Via

Access Paper or Ask Questions

FuseGen: PLM Fusion for Data-generation based Zero-shot Learning

Jun 18, 2024

Tianyuan Zou, Yang Liu, Peng Li, Jianqing Zhang, Jingjing Liu, Ya-Qin Zhang

Figure 1 for FuseGen: PLM Fusion for Data-generation based Zero-shot Learning

Figure 2 for FuseGen: PLM Fusion for Data-generation based Zero-shot Learning

Figure 3 for FuseGen: PLM Fusion for Data-generation based Zero-shot Learning

Figure 4 for FuseGen: PLM Fusion for Data-generation based Zero-shot Learning

Abstract:Data generation-based zero-shot learning, although effective in training Small Task-specific Models (STMs) via synthetic datasets generated by Pre-trained Language Models (PLMs), is often limited by the low quality of such synthetic datasets. Previous solutions have primarily focused on single PLM settings, where synthetic datasets are typically restricted to specific sub-spaces and often deviate from real-world distributions, leading to severe distribution bias. To mitigate such bias, we propose FuseGen, a novel data generation-based zero-shot learning framework that introduces a new criteria for subset selection from synthetic datasets via utilizing multiple PLMs and trained STMs. The chosen subset provides in-context feedback to each PLM, enhancing dataset quality through iterative data generation. Trained STMs are then used for sample re-weighting as well, further improving data quality. Extensive experiments across diverse tasks demonstrate that FuseGen substantially outperforms existing methods, highly effective in boosting STM performance in a PLM-agnostic way. Code is provided in https://github.com/LindaLydia/FuseGen.

* 17 pages, 8 figures, 12 tabels

Via

Access Paper or Ask Questions

Toward Optimal LLM Alignments Using Two-Player Games

Jun 16, 2024

Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang(+3 more)

Figure 1 for Toward Optimal LLM Alignments Using Two-Player Games

Figure 2 for Toward Optimal LLM Alignments Using Two-Player Games

Figure 3 for Toward Optimal LLM Alignments Using Two-Player Games

Figure 4 for Toward Optimal LLM Alignments Using Two-Player Games

Abstract:The standard Reinforcement Learning from Human Feedback (RLHF) framework primarily focuses on optimizing the performance of large language models using pre-collected prompts. However, collecting prompts that provide comprehensive coverage is both tedious and challenging, and often fails to include scenarios that LLMs need to improve on the most. In this paper, we investigate alignment through the lens of two-agent games, involving iterative interactions between an adversarial and a defensive agent. The adversarial agent's task at each step is to generate prompts that expose the weakness of the defensive agent. In return, the defensive agent seeks to improve its responses to these newly identified prompts it struggled with, based on feedback from the reward model. We theoretically demonstrate that this iterative reinforcement learning optimization converges to a Nash Equilibrium for the game induced by the agents. Experimental results in safety scenarios demonstrate that learning in such a competitive environment not only fully trains agents but also leads to policies with enhanced generalization capabilities for both adversarial and defensive agents.

* Our code is released at https://github.com/ruizheng20/gpo

Via

Access Paper or Ask Questions

Technique Report of CVPR 2024 PBDL Challenges

Jun 15, 2024

Ying Fu, Yu Li, Shaodi You, Boxin Shi, Jose Alvarez, Coert van Gemeren, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li(+91 more)

Figure 1 for Technique Report of CVPR 2024 PBDL Challenges

Figure 2 for Technique Report of CVPR 2024 PBDL Challenges

Figure 3 for Technique Report of CVPR 2024 PBDL Challenges

Figure 4 for Technique Report of CVPR 2024 PBDL Challenges

Abstract:The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held in CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.

* CVPR 2024 Workshop - PBDL Challenge Report

Via

Access Paper or Ask Questions

Large Language Model Unlearning via Embedding-Corrupted Prompts

Jun 12, 2024

Chris Yuhao Liu, Yaxuan Wang, Jeffrey Flanigan, Yang Liu

Figure 1 for Large Language Model Unlearning via Embedding-Corrupted Prompts

Figure 2 for Large Language Model Unlearning via Embedding-Corrupted Prompts

Figure 3 for Large Language Model Unlearning via Embedding-Corrupted Prompts

Figure 4 for Large Language Model Unlearning via Embedding-Corrupted Prompts

Abstract:Large language models (LLMs) have advanced to encompass extensive knowledge across diverse domains. Yet controlling what a large language model should not know is important for ensuring alignment and thus safe use. However, accurately and efficiently unlearning knowledge from an LLM remains challenging due to the potential collateral damage caused by the fuzzy boundary between retention and forgetting, and the large computational requirements for optimization across state-of-the-art models with hundreds of billions of parameters. In this work, we present Embedding-COrrupted (ECO) Prompts, a lightweight unlearning framework for large language models to address both the challenges of knowledge entanglement and unlearning efficiency. Instead of relying on the LLM itself to unlearn, we enforce an unlearned state during inference by employing a prompt classifier to identify and safeguard prompts to forget. We learn corruptions added to prompt embeddings via zeroth order optimization toward the unlearning objective offline and corrupt prompts flagged by the classifier during inference. We find that these embedding-corrupted prompts not only lead to desirable outputs that satisfy the unlearning objective but also closely approximate the output from a model that has never been trained on the data intended for forgetting. Through extensive experiments on unlearning, we demonstrate the superiority of our method in achieving promising unlearning at nearly zero side effects in general domains and domains closely related to the unlearned ones. Additionally, we highlight the scalability of our method to 100 LLMs, ranging from 0.5B to 236B parameters, incurring no additional cost as the number of parameters increases.

* 55 pages, 4 figures, 66 tables

Via

Access Paper or Ask Questions

Label Smoothing Improves Machine Unlearning

Jun 11, 2024

Zonglin Di, Zhaowei Zhu, Jinghan Jia, Jiancheng Liu, Zafar Takhirov, Bo Jiang, Yuanshun Yao, Sijia Liu, Yang Liu

Figure 1 for Label Smoothing Improves Machine Unlearning

Figure 2 for Label Smoothing Improves Machine Unlearning

Figure 3 for Label Smoothing Improves Machine Unlearning

Figure 4 for Label Smoothing Improves Machine Unlearning

Abstract:The objective of machine unlearning (MU) is to eliminate previously learned data from a model. However, it is challenging to strike a balance between computation cost and performance when using existing MU techniques. Taking inspiration from the influence of label smoothing on model confidence and differential privacy, we propose a simple gradient-based MU approach that uses an inverse process of label smoothing. This work introduces UGradSL, a simple, plug-and-play MU approach that uses smoothed labels. We provide theoretical analyses demonstrating why properly introducing label smoothing improves MU performance. We conducted extensive experiments on six datasets of various sizes and different modalities, demonstrating the effectiveness and robustness of our proposed method. The consistent improvement in MU performance is only at a marginal cost of additional computations. For instance, UGradSL improves over the gradient ascent MU baseline by 66% unlearning accuracy without sacrificing unlearning efficiency.

Via

Access Paper or Ask Questions

Adversarial Machine Unlearning

Jun 11, 2024

Zonglin Di, Sixie Yu, Yevgeniy Vorobeychik, Yang Liu

Figure 1 for Adversarial Machine Unlearning

Figure 2 for Adversarial Machine Unlearning

Figure 3 for Adversarial Machine Unlearning

Figure 4 for Adversarial Machine Unlearning

Abstract:This paper focuses on the challenge of machine unlearning, aiming to remove the influence of specific training data on machine learning models. Traditionally, the development of unlearning algorithms runs parallel with that of membership inference attacks (MIA), a type of privacy threat to determine whether a data instance was used for training. However, the two strands are intimately connected: one can view machine unlearning through the lens of MIA success with respect to removed data. Recognizing this connection, we propose a game-theoretic framework that integrates MIAs into the design of unlearning algorithms. Specifically, we model the unlearning problem as a Stackelberg game in which an unlearner strives to unlearn specific training data from a model, while an auditor employs MIAs to detect the traces of the ostensibly removed data. Adopting this adversarial perspective allows the utilization of new attack advancements, facilitating the design of unlearning algorithms. Our framework stands out in two ways. First, it takes an adversarial approach and proactively incorporates the attacks into the design of unlearning algorithms. Secondly, it uses implicit differentiation to obtain the gradients that limit the attacker's success, thus benefiting the process of unlearning. We present empirical results to demonstrate the effectiveness of the proposed approach for machine unlearning.

Via

Access Paper or Ask Questions

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Jun 11, 2024

Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen

Figure 1 for Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Figure 2 for Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Figure 3 for Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Figure 4 for Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Abstract:Efficiently modeling sequences with infinite context length has been a long-standing problem. Past works suffer from either the quadratic computation complexity or the limited extrapolation ability on length generalization. In this work, we present Samba, a simple hybrid architecture that layer-wise combines Mamba, a selective State Space Model (SSM), with Sliding Window Attention (SWA). Samba selectively compresses a given sequence into recurrent hidden states while still maintaining the ability to precisely recall memories with the attention mechanism. We scale Samba up to 3.8B parameters with 3.2T training tokens and show that Samba substantially outperforms the state-of-the-art models based on pure attention or SSMs on a wide range of benchmarks. When trained on 4K length sequences, Samba can be efficiently extrapolated to 256K context length with perfect memory recall and show improved token predictions up to 1M context length. As a linear-time sequence model, Samba enjoys a 3.73x higher throughput compared to Transformers with grouped-query attention when processing user prompts of 128K length, and 3.64x speedup when generating 64K tokens with unlimited streaming. A sample implementation of Samba is publicly available in https://github.com/microsoft/Samba.

Via

Access Paper or Ask Questions

Texture Re-scalable Universal Adversarial Perturbation

Jun 10, 2024

Yihao Huang, Qing Guo, Felix Juefei-Xu, Ming Hu, Xiaojun Jia, Xiaochun Cao, Geguang Pu, Yang Liu

Abstract:Universal adversarial perturbation (UAP), also known as image-agnostic perturbation, is a fixed perturbation map that can fool the classifier with high probabilities on arbitrary images, making it more practical for attacking deep models in the real world. Previous UAP methods generate a scale-fixed and texture-fixed perturbation map for all images, which ignores the multi-scale objects in images and usually results in a low fooling ratio. Since the widely used convolution neural networks tend to classify objects according to semantic information stored in local textures, it seems a reasonable and intuitive way to improve the UAP from the perspective of utilizing local contents effectively. In this work, we find that the fooling ratios significantly increase when we add a constraint to encourage a small-scale UAP map and repeat it vertically and horizontally to fill the whole image domain. To this end, we propose texture scale-constrained UAP (TSC-UAP), a simple yet effective UAP enhancement method that automatically generates UAPs with category-specific local textures that can fool deep models more easily. Through a low-cost operation that restricts the texture scale, TSC-UAP achieves a considerable improvement in the fooling ratio and attack transferability for both data-dependent and data-free UAP methods. Experiments conducted on two state-of-the-art UAP methods, eight popular CNN models and four classical datasets show the remarkable performance of TSC-UAP.

* 14 pages (accepted by TIFS2024)

Via

Access Paper or Ask Questions