Mingyan Liu

Federated Learning with Reduced Information Leakage and Computation

Oct 10, 2023
Tongxin Yin, Xueru Zhang, Mohammad Mahdi Khalili, Mingyan Liu

Federated learning (FL) is a distributed learning paradigm that allows multiple decentralized clients to collaboratively learn a common model without sharing local data. Although local data is not exposed directly, privacy concerns nonetheless exist, as clients' sensitive information can be inferred from intermediate computations. Moreover, such information leakage accumulates substantially over time as the same data is repeatedly used during the iterative learning process. As a result, it can be particularly difficult to balance the privacy-accuracy trade-off when designing privacy-preserving FL algorithms. In this paper, we introduce Upcycled-FL, a novel federated learning framework with a first-order approximation applied at every even iteration. Under this framework, half of the FL updates incur no information leakage and require much less computation. We first conduct a theoretical analysis of the convergence (rate) of Upcycled-FL, and then apply perturbation mechanisms to preserve privacy. Experiments on real-world data show that Upcycled-FL consistently outperforms existing methods on heterogeneous data, and significantly improves the privacy-accuracy trade-off while reducing training time by 48% on average.
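
The even-iteration idea can be illustrated with a minimal sketch. This is not the paper's algorithm: the actual Upcycled-FL update is derived from a first-order approximation of the local objective, whereas below odd rounds are plain FedAvg-style rounds and even rounds simply extrapolate from previously released models (the weight `lam` and the helper `local_update` are hypothetical). The sketch only shows why even rounds need no client computation and leak no new information.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 4 clients, each with a local least-squares problem (illustrative only).
d, n_clients, T = 5, 4, 10
A = [rng.normal(size=(20, d)) for _ in range(n_clients)]
b = [a @ rng.normal(size=d) + 0.1 * rng.normal(size=20) for a in A]

def local_update(theta, a, y, lr=0.05, steps=5):
    """Plain local gradient steps on the client's data (data is touched here)."""
    w = theta.copy()
    for _ in range(steps):
        w -= lr * a.T @ (a @ w - y) / len(y)
    return w

theta = np.zeros(d)
theta_prev_odd = theta.copy()   # model released before the most recent odd round

for t in range(1, T + 1):
    if t % 2 == 1:
        # Odd round: regular FL round -- clients compute on local data;
        # this is where privacy-preserving perturbation noise would be injected.
        theta_new = np.mean([local_update(theta, A[i], b[i]) for i in range(n_clients)], axis=0)
        theta_prev_odd, theta = theta, theta_new
    else:
        # Even round: first-order (extrapolation-style) update reusing only
        # previously released quantities -- no client data access, no new leakage.
        lam = 0.5  # hypothetical extrapolation weight
        theta = theta + lam * (theta - theta_prev_odd)

print("final model:", np.round(theta, 3))
```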

Fair Classifiers that Abstain without Harm

Oct 09, 2023
Tongxin Yin, Jean-François Ton, Ruocheng Guo, Yuanshun Yao, Mingyan Liu, Yang Liu

In critical applications, it is vital for classifiers to defer decision-making to humans. We propose a post-hoc method that makes existing classifiers selectively abstain from predicting certain samples. Our abstaining classifier is incentivized to maintain the original accuracy for each sub-population (i.e., no harm) while achieving a set of group fairness definitions to a user-specified degree. To this end, we design an Integer Programming (IP) procedure that assigns abstention decisions to each training sample so as to satisfy a set of constraints. To generalize the abstaining decisions to test samples, we then train a surrogate model to learn the abstaining decisions based on the IP solutions in an end-to-end manner. We analyze the feasibility of the IP procedure to determine the possible abstention rate for different levels of unfairness tolerance and accuracy constraint needed to achieve no harm. To the best of our knowledge, this work is the first to identify the theoretical relationships between the constraint parameters and the required abstention rate. These theoretical results are important because a high abstention rate is often infeasible in practice due to a lack of human resources. Our framework outperforms existing methods in terms of fairness disparity without sacrificing accuracy at similar abstention rates.
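
A toy version of the abstention-assignment step might look as follows (a sketch using PuLP, which is not necessarily the solver used in the paper). To keep every constraint linear, the sketch fixes the number of abstentions per group in advance and enforces a demographic-parity gap on the kept samples; the paper's actual IP instead treats the abstention rate as part of the optimization and adds per-group no-harm constraints, so this only conveys the flavor of the formulation.

```python
import numpy as np
from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum, LpStatus, PULP_CBC_CMD

rng = np.random.default_rng(1)
n = 40
group = rng.integers(0, 2, size=n)                           # sensitive attribute
y_true = rng.integers(0, 2, size=n)                          # ground-truth labels
y_hat = np.where(rng.random(n) < 0.8, y_true, 1 - y_true)    # imperfect base classifier

budget = {0: 5, 1: 5}   # fixed abstentions per group (simplification to stay linear)
eps = 0.2               # allowed demographic-parity gap among kept samples

prob = LpProblem("abstention_assignment", LpMinimize)
a = [LpVariable(f"a_{i}", cat=LpBinary) for i in range(n)]   # a_i = 1 -> abstain on sample i

# Objective: minimize base-classifier errors among the samples that are kept.
prob += lpSum(1 - a[i] for i in range(n) if y_hat[i] != y_true[i])

kept = {}
for g in (0, 1):
    idx = [i for i in range(n) if group[i] == g]
    prob += lpSum(a[i] for i in idx) == budget[g]            # exact per-group abstention budget
    kept[g] = len(idx) - budget[g]                           # kept-group size is now a constant

# Demographic parity on kept samples, multiplied through by kept[0]*kept[1] to stay linear.
pos = {g: lpSum(int(y_hat[i]) * (1 - a[i]) for i in range(n) if group[i] == g) for g in (0, 1)}
prob += kept[1] * pos[0] - kept[0] * pos[1] <= eps * kept[0] * kept[1]
prob += kept[0] * pos[1] - kept[1] * pos[0] <= eps * kept[0] * kept[1]

prob.solve(PULP_CBC_CMD(msg=0))
abstain = np.array([int(v.value()) for v in a])
print(LpStatus[prob.status], "| abstentions:", int(abstain.sum()))
```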

Professional Basketball Player Behavior Synthesis via Planning with Diffusion

Jun 10, 2023
Xiusi Chen, Wei-Yao Wang, Ziniu Hu, Curtis Chou, Lam Hoang, Kun Jin, Mingyan Liu, P. Jeffrey Brantingham, Wei Wang

Dynamic planning in multi-agent systems has been explored to improve decision-making in various domains. Professional basketball serves as a compelling example of a dynamic spatio-temporal game, encompassing both concealed strategic policies and decision-making. However, processing the diverse on-court signals and navigating the vast space of potential actions and outcomes makes it difficult for existing approaches to swiftly identify optimal strategies in response to evolving circumstances. In this study, we first formulate the sequential decision-making process as a conditional trajectory generation process. We further introduce PLAYBEST (PLAYer BEhavior SynThesis), a method for enhancing player decision-making. We extend the state-of-the-art generative model, the diffusion probabilistic model, to learn challenging multi-agent environmental dynamics from historical National Basketball Association (NBA) player motion tracking data. To incorporate data-driven strategies, an auxiliary value function is trained on play-by-play data, with the corresponding rewards acting as plan guidance. To accomplish reward-guided trajectory generation, conditional sampling is introduced to condition the diffusion model on the value function and to conduct classifier-guided sampling. We validate the effectiveness of PLAYBEST through comprehensive simulation studies on real-world data, contrasting the generated trajectories and play strategies with those employed by professional basketball teams. Our results show that the model excels at generating high-quality basketball trajectories that yield efficient plays, surpassing conventional planning techniques in adaptability, flexibility, and overall performance. Moreover, the synthesized play strategies align remarkably well with professional tactics, highlighting the model's capacity to capture the intricate dynamics of basketball games.
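
The value-guided sampling step can be sketched roughly as below. The networks are untrained stand-ins (the real denoiser and value function are trained on NBA tracking and play-by-play data), the noise schedule and guidance rule are generic classifier-guidance formulas rather than the paper's exact ones, and `guidance_scale` is hypothetical; the sketch only shows how a value gradient can steer reverse diffusion over a trajectory.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
horizon, state_dim, T = 16, 4, 50   # toy trajectory length / state size / diffusion steps

# Untrained stand-ins for the learned denoiser and value function.
denoiser = nn.Sequential(nn.Linear(horizon * state_dim + 1, 128), nn.ReLU(),
                         nn.Linear(128, horizon * state_dim))
value_fn = nn.Sequential(nn.Linear(horizon * state_dim, 64), nn.ReLU(), nn.Linear(64, 1))

betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def guided_sample(guidance_scale=0.1):
    """Reverse diffusion over a flattened trajectory, nudged toward high value."""
    x = torch.randn(1, horizon * state_dim)
    for t in reversed(range(T)):
        with torch.no_grad():
            t_emb = torch.full((1, 1), t / T)
            eps_hat = denoiser(torch.cat([x, t_emb], dim=-1))
            # Standard DDPM posterior mean recovered from the predicted noise.
            mean = (x - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps_hat) / torch.sqrt(alphas[t])
        # Value guidance: shift the mean along the gradient of the value estimate.
        x_in = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(value_fn(x_in).sum(), x_in)[0]
        with torch.no_grad():
            mean = mean + guidance_scale * betas[t] * grad
            noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
            x = mean + torch.sqrt(betas[t]) * noise
    return x.view(horizon, state_dim)

print(guided_sample().shape)   # torch.Size([16, 4])
```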

Performative Federated Learning: A Solution to Model-Dependent and Heterogeneous Distribution Shifts

May 08, 2023
Kun Jin, Tongxin Yin, Zhongzhu Chen, Zeyu Sun, Xueru Zhang, Yang Liu, Mingyan Liu

We consider a federated learning (FL) system consisting of multiple clients and a server, where the clients aim to collaboratively learn a common decision model from their distributed data. Unlike the conventional FL framework, which assumes the clients' data are static, we consider scenarios in which the clients' data distributions may be reshaped by the deployed decision model. In this work, we leverage the idea of distribution shift mappings in performative prediction to formalize this model-dependent data distribution shift, and we propose a performative federated learning framework. We first introduce necessary and sufficient conditions for the existence of a unique performative stable solution and characterize its distance to the performative optimal solution. We then propose the performative FedAvg algorithm and show that it converges to the performative stable solution at a rate of O(1/T) under both full and partial participation schemes. In particular, we use novel proof techniques to show how the clients' heterogeneity influences convergence. Numerical results validate our analysis and provide valuable insights into real-world applications.
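
A minimal simulation of the feedback loop is sketched below, assuming a simple linear distribution-shift map (each client's label-generating parameter drifts with the deployed model); the paper's shift mappings, participation schemes, and step-size schedule are more general. The sketch only illustrates that iterating deploy-shift-retrain drives the model toward a performative stable point.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_clients, rounds, local_steps, lr = 3, 5, 300, 5, 0.05
eps_shift = 0.3                                  # strength of the model-dependent shift
base_w = [rng.normal(size=d) for _ in range(n_clients)]

def sample_client_data(i, theta, n=64):
    # Client i's label-generating parameter reacts to the deployed model theta
    # (a simple linear distribution-shift map, used here only for illustration).
    w_i = base_w[i] + eps_shift * theta
    X = rng.normal(size=(n, d))
    y = X @ w_i + 0.1 * rng.normal(size=n)
    return X, y

theta = np.zeros(d)
for r in range(rounds):
    local_models = []
    for i in range(n_clients):
        X, y = sample_client_data(i, theta)      # data is reshaped by the deployed model
        w = theta.copy()
        for _ in range(local_steps):             # local gradient steps on a squared loss
            w -= lr * X.T @ (X @ w - y) / len(y)
        local_models.append(w)
    theta = np.mean(local_models, axis=0)        # FedAvg aggregation

# For this linear shift map the performative stable point is approximately
# mean(base_w) / (1 - eps_shift).
print("learned :", np.round(theta, 2))
print("expected:", np.round(np.mean(base_w, axis=0) / (1 - eps_shift), 2))
```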

Long-Term Fairness with Unknown Dynamics

Apr 19, 2023
Tongxin Yin, Reilly Raab, Mingyan Liu, Yang Liu

While machine learning can myopically reinforce social inequalities, it may also be used to dynamically seek equitable outcomes. In this paper, we formalize long-term fairness in the context of online reinforcement learning. This formulation can accommodate dynamical control objectives, such as driving equity inherent in the state of a population, that cannot be incorporated into static formulations of fairness. We demonstrate that this framing allows an algorithm to adapt to unknown dynamics by sacrificing short-term incentives to drive a classifier-population system towards more desirable equilibria. For the proposed setting, we develop an algorithm that adapts recent work in online learning. We prove that this algorithm achieves simultaneous probabilistic bounds on cumulative loss and cumulative violations of fairness (as statistical regularities between demographic groups). We compare our proposed algorithm to the repeated retraining of myopic classifiers, as a baseline, and to a deep reinforcement learning algorithm that lacks safety guarantees. Our experiments model human populations according to evolutionary game theory and integrate real-world datasets.
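
The setting (though not the paper's algorithm or its evolutionary-game dynamics) can be sketched with a toy classifier-population loop that tracks the two quantities the paper bounds, cumulative loss and cumulative fairness violation; the policy, loss, and population-response rule below are ad-hoc placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
q = np.array([0.45, 0.65])          # qualification rates of two demographic groups
cum_loss, cum_violation = 0.0, 0.0

for t in range(T):
    # Policy: per-group acceptance rates. A myopic policy would match current
    # qualification; this one sacrifices some short-term utility to push toward parity.
    accept = q + 0.3 * (q.mean() - q)
    cum_loss += np.abs(accept - q).sum()            # ad-hoc instantaneous loss
    cum_violation += abs(accept[0] - accept[1])     # demographic-parity gap as the violation
    # Ad-hoc population response: qualification drifts toward the acceptance rate
    # (a stand-in for the paper's evolutionary-game dynamics).
    q = np.clip(q + 0.05 * (accept - q) + 0.01 * rng.normal(size=2), 0.0, 1.0)

print("final qualification rates:", np.round(q, 3))
print(f"cumulative loss: {cum_loss:.2f}  cumulative fairness violation: {cum_violation:.2f}")
```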

DensePure: Understanding Diffusion Models towards Adversarial Robustness

Nov 01, 2022
Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song

Diffusion models have recently been employed to improve certified robustness through the process of denoising. However, a theoretical understanding of why diffusion models are able to improve certified robustness is still lacking, which prevents further improvement. In this study, we close this gap by analyzing the fundamental properties of diffusion models and establishing the conditions under which they can enhance certified robustness. This deeper understanding allows us to propose a new method, DensePure, designed to improve the certified robustness of a pretrained model (i.e., classifier). Given an (adversarial) input, DensePure consists of multiple runs of denoising via the reverse process of the diffusion model (with different random seeds) to obtain multiple reversed samples, which are then passed through the classifier, followed by majority voting on the inferred labels to make the final prediction. This design of using multiple runs of denoising is informed by our theoretical analysis of the conditional distribution of the reversed sample. Specifically, when the data density of a clean sample is high, its conditional density under the reverse process of a diffusion model is also high; thus sampling from the latter conditional distribution can purify the adversarial example and return the corresponding clean sample with high probability. By using the highest-density point in the conditional distribution as the reversed sample, we identify the robust region of a given instance under the diffusion model's reverse process. We show that this robust region is a union of multiple convex sets and is potentially much larger than the robust regions identified in previous works. In practice, DensePure can approximate the label of the high-density region in the conditional distribution so that it can enhance certified robustness.
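
The inference procedure reduces to "denoise several times, classify each reversed sample, majority-vote". The sketch below uses untrained stand-ins for the pretrained diffusion reverse process and classifier (and a made-up noise level `sigma`), so it captures only the control flow, not the certified-robustness machinery.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dim, n_classes, n_runs = 32, 10, 7

# Stand-in for the fixed pretrained classifier DensePure assumes.
classifier = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, n_classes))

def reverse_denoise(x_noisy, seed):
    """Placeholder for one run of the diffusion model's reverse (denoising) process.
    The real reverse process maps a noised input back toward the data manifold;
    different seeds give different reversed samples."""
    g = torch.Generator().manual_seed(seed)
    return x_noisy + 0.1 * torch.randn(x_noisy.shape, generator=g)

def densepure_predict(x_adv, sigma=0.25):
    votes = []
    for seed in range(n_runs):
        x_noisy = x_adv + sigma * torch.randn_like(x_adv)    # randomized-smoothing noise
        x_rev = reverse_denoise(x_noisy, seed)                # one reversed sample per run
        votes.append(classifier(x_rev).argmax(dim=-1))
    votes = torch.stack(votes)                                # (n_runs, batch)
    return torch.mode(votes, dim=0).values                    # majority vote over runs

x = torch.randn(1, dim)
print("predicted label:", densepure_predict(x).item())
```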

Invisible for both Camera and LiDAR: Security of Multi-Sensor Fusion based Perception in Autonomous Driving Under Physical-World Attacks

Jun 17, 2021
Yulong Cao*, Ningfei Wang*, Chaowei Xiao*, Dawei Yang*, Jin Fang, Ruigang Yang, Qi Alfred Chen, Mingyan Liu, Bo Li (*co-first authors)

In Autonomous Driving (AD) systems, perception is both security and safety critical. While there have been various prior studies on its security issues, they all consider attacks on camera- or LiDAR-based AD perception alone. However, production AD systems today predominantly adopt a Multi-Sensor Fusion (MSF) based design, which in principle can be more robust against these attacks under the assumption that not all fusion sources are (or can be) attacked at the same time. In this paper, we present the first study of the security of MSF-based perception in AD systems. We directly challenge the basic MSF design assumption above by exploring the possibility of attacking all fusion sources simultaneously. This allows us, for the first time, to understand how much security guarantee MSF can fundamentally provide as a general defense strategy for AD perception. We formulate the attack as an optimization problem to generate a physically realizable, adversarial 3D-printed object that misleads an AD system into failing to detect it and thus crashing into it. We propose a novel attack pipeline that addresses two main design challenges: (1) non-differentiable target camera and LiDAR sensing systems, and (2) non-differentiable cell-level aggregated features popularly used in LiDAR-based AD perception. We evaluate our attack on MSF modules included in representative open-source industry-grade AD systems in real-world driving scenarios. Our results show that the attack achieves over 90% success rate across different object types and MSF algorithms. The attack is also found to be stealthy, robust to victim positions, transferable across MSF algorithms, and realizable in the physical world after being 3D-printed and captured by LiDAR and camera devices. To concretely assess the end-to-end safety impact, we further perform a simulation evaluation and show that the attack can cause a 100% vehicle collision rate for an industry-grade AD system.
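
At its core the attack is a joint optimization of one object representation against both sensing pipelines. The sketch below replaces the (non-differentiable) camera and LiDAR pipelines with tiny differentiable surrogates and optimizes a latent shape vector to suppress both detection scores; the real attack instead operates on a 3D mesh, handles the non-differentiable sensing and cell-level feature aggregation that the paper identifies as the main challenges, and adds physical-realizability constraints.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim = 16

# Differentiable surrogates standing in for the camera and LiDAR perception pipelines.
camera_score = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))
lidar_score = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))
for p in list(camera_score.parameters()) + list(lidar_score.parameters()):
    p.requires_grad_(False)   # surrogates are frozen; only the object is optimized

shape = torch.zeros(1, latent_dim, requires_grad=True)   # parameters of the 3D object
opt = torch.optim.Adam([shape], lr=0.05)

for step in range(200):
    opt.zero_grad()
    # Attack objective: suppress the detection score of BOTH fusion sources
    # simultaneously, plus a crude realizability regularizer on the shape.
    loss = camera_score(shape).mean() + lidar_score(shape).mean() + 0.01 * shape.pow(2).sum()
    loss.backward()
    opt.step()

print("final joint detection loss:", float(loss))
```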

* Accepted by IEEE S&P 2021 

How Do Fair Decisions Fare in Long-term Qualification?

Oct 21, 2020
Xueru Zhang, Ruibo Tu, Yang Liu, Mingyan Liu, Hedvig Kjellström, Kun Zhang, Cheng Zhang

Although many fairness criteria have been proposed for decision making, their long-term impact on the well-being of a population remains unclear. In this work, we study the dynamics of population qualification and algorithmic decisions under a partially observed Markov decision process setting. By characterizing the equilibrium of such dynamics, we analyze the long-term impact of static fairness constraints on the equality and improvement of group well-being. Our results show that static fairness constraints can either promote equality or exacerbate disparity, depending on the driving factor of qualification transitions and the effect of sensitive attributes on feature distributions. We also consider possible interventions that can effectively improve group qualification or promote equality of group qualification. Our theoretical results and experiments on static real-world datasets with simulated dynamics show that our framework can be used to facilitate social science studies.
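
A stripped-down version of the group-level dynamics can be simulated as below; the transition probabilities, the static decision rule, and the assumption that both groups share the same dynamics are illustrative choices rather than the paper's calibrated setting, but iterating the update shows how qualification rates settle into an equilibrium whose location depends on those transitions.

```python
import numpy as np

# Transition probabilities P(qualified next | qualified now, accepted), shared across
# groups here for simplicity (the paper lets these differ and drive the conclusions).
T_q = {
    (1, 1): 0.90, (1, 0): 0.70,
    (0, 1): 0.40, (0, 0): 0.20,
}

def step(alpha, accept_rate_q, accept_rate_u):
    """One round of group-level dynamics: alpha is the fraction currently qualified."""
    return (alpha * (accept_rate_q * T_q[(1, 1)] + (1 - accept_rate_q) * T_q[(1, 0)])
            + (1 - alpha) * (accept_rate_u * T_q[(0, 1)] + (1 - accept_rate_u) * T_q[(0, 0)]))

alpha = np.array([0.3, 0.7])   # initial qualification rates of two demographic groups
for t in range(100):
    # A static decision rule: qualified individuals accepted with prob 0.8, unqualified
    # with prob 0.2, identically across groups (fairness-constrained policies would
    # instead set these rates per group).
    alpha = np.array([step(a, 0.8, 0.2) for a in alpha])

print("equilibrium qualification rates:", np.round(alpha, 3))
```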

* Accepted to the 34th Conference on Neural Information Processing Systems (NeurIPS) 

Fairness in Learning-Based Sequential Decision Algorithms: A Survey

Jan 14, 2020
Xueru Zhang, Mingyan Liu

Algorithmic fairness in decision-making has been studied extensively in static settings where one-shot decisions are made on tasks such as classification. In practice, however, most decision-making processes are of a sequential nature, where decisions made in the past may have an impact on future data. This is particularly the case when decisions affect the individuals or users generating the data used for future decisions. In this survey, we review existing literature on the fairness of data-driven sequential decision-making. We focus on two types of sequential decisions: (1) past decisions have no impact on the underlying user population and thus no impact on future data; (2) past decisions have an impact on the underlying user population, and therefore on future data, which can then impact future decisions. In each case, we examine the impact of various fairness interventions on the underlying population.

Recycled ADMM: Improving the Privacy and Accuracy of Distributed Algorithms

Oct 08, 2019
Xueru Zhang, Mohammad Mahdi Khalili, Mingyan Liu

The alternating direction method of multipliers (ADMM) is a powerful method for solving decentralized convex optimization problems. In distributed settings, each node performs computation with its local data, and the local results are exchanged among neighboring nodes in an iterative fashion. During this iterative process, data privacy leakage arises and can accumulate significantly over many iterations, making it difficult to balance the privacy-accuracy tradeoff. We propose Recycled ADMM (R-ADMM), where a linear approximation is applied at every even iteration, whose solution is calculated directly using only results from the previous, odd iteration. Under such a scheme, half of the updates incur no privacy loss and require much less computation compared to conventional ADMM. Moreover, R-ADMM can be further modified (MR-ADMM) such that each node independently determines its own penalty parameter over iterations. We obtain a sufficient condition for the convergence of both algorithms and provide a privacy analysis based on objective perturbation. We show that the privacy-accuracy tradeoff can be improved significantly compared with conventional ADMM.
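
The "recycling" trick can be made concrete on a global-consensus ADMM with quadratic local losses: at even iterations the local objective is linearized around the previous iterate, and the needed gradient is recovered from the previous (odd) iteration's optimality condition, so the update touches no raw data. The sketch below illustrates that idea only; the penalty parameter, problem sizes, and the consensus formulation are illustrative, and the MR-ADMM variant and the objective-perturbation privacy mechanism are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_nodes, rho, T = 4, 3, 1.0, 200

# Local least-squares objectives f_i(x) = 0.5 * ||A_i x - b_i||^2 (toy, scaled data).
A = [rng.normal(size=(30, d)) / np.sqrt(30) for _ in range(n_nodes)]
b = [a @ rng.normal(size=d) + 0.05 * rng.normal(size=30) for a in A]

x = [np.zeros(d) for _ in range(n_nodes)]
u = [np.zeros(d) for _ in range(n_nodes)]
z = np.zeros(d)
z_prev, u_prev = z.copy(), [ui.copy() for ui in u]

for k in range(T):
    z_in, u_in = z.copy(), [ui.copy() for ui in u]    # values entering this iteration
    if k % 2 == 0:
        # Exact iteration: solve the regularized local problem (touches A_i, b_i).
        for i in range(n_nodes):
            x[i] = np.linalg.solve(A[i].T @ A[i] + rho * np.eye(d),
                                   A[i].T @ b[i] + rho * (z - u[i]))
    else:
        # Recycled iteration: linearize f_i at the previous x_i. Its gradient follows
        # from the previous exact update's optimality condition,
        #   grad f_i(x_i) = -rho * (x_i - z_prev + u_prev_i),
        # so only previously released quantities are reused: no data access, no new leakage.
        for i in range(n_nodes):
            grad_i = -rho * (x[i] - z_prev + u_prev[i])
            x[i] = z - u[i] - grad_i / rho
    z_prev, u_prev = z_in, u_in
    z = np.mean([x[i] + u[i] for i in range(n_nodes)], axis=0)
    for i in range(n_nodes):
        u[i] = u[i] + x[i] - z

# The consensus variable should approximately match the centralized solution.
A_all, b_all = np.vstack(A), np.concatenate(b)
print("R-ADMM consensus :", np.round(z, 2))
print("centralized soln :", np.round(np.linalg.lstsq(A_all, b_all, rcond=None)[0], 2))
```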

* In IEEE Transactions on Information Forensics and Security (TIFS), 2019  
* arXiv admin note: substantial text overlap with arXiv:1810.03197 