Tianhao Wang

University of Virginia

The Marginal Value of Momentum for Small Learning Rate SGD

Jul 27, 2023
Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li

Momentum is known to accelerate the convergence of gradient descent in strongly convex settings without stochastic gradient noise. In stochastic optimization, such as training neural networks, folklore suggests that momentum may help deep learning optimization by reducing the variance of the stochastic gradient update, but previous theoretical analyses do not find momentum to offer any provable acceleration. Theoretical results in this paper clarify the role of momentum in stochastic settings where the learning rate is small and gradient noise is the dominant source of instability, suggesting that SGD with and without momentum behave similarly over both short and long time horizons. Experiments show that momentum indeed has limited benefits for both optimization and generalization in practical training regimes where the optimal learning rate is not very large, including small- to medium-batch training from scratch on ImageNet and fine-tuning language models on downstream tasks.
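
The claim can be illustrated with a toy experiment: a minimal sketch (assuming a noisy quadratic objective; `noisy_grad` and all hyperparameters are hypothetical choices, not the paper's setup) comparing plain SGD with heavy-ball SGDM at the same effective learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x, sigma=1.0):
    # Gradient of f(x) = 0.5 * x^2 plus Gaussian noise: in the
    # small-learning-rate regime this noise dominates the dynamics.
    return x + sigma * rng.normal()

def run(lr, beta, steps=50_000, x0=5.0):
    # Heavy-ball update; beta = 0 recovers plain SGD.
    x, m = x0, 0.0
    for _ in range(steps):
        m = beta * m + noisy_grad(x)
        x -= lr * m
    return x

# SGDM with momentum beta behaves like SGD at effective learning rate
# lr / (1 - beta); matching effective rates makes the two runs similar.
print(run(lr=1e-3, beta=0.0))
print(run(lr=1e-3 * (1 - 0.9), beta=0.9))
```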

Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems

Jul 03, 2023
Debopam Sanyal, Jui-Tse Hung, Manav Agrawal, Prahlad Jasti, Shahab Nikkhoo, Somesh Jha, Tianhao Wang, Sibin Mohan, Alexey Tumanov

With the emergence of large foundational models, model-serving systems are becoming popular. In such a system, users send queries to the server and specify the desired performance metrics (e.g., accuracy, latency, etc.). The server maintains a set of models (a model zoo) in the back-end and serves the queries based on the specified metrics. This paper examines the security, specifically robustness against model extraction attacks, of such systems. Existing black-box attacks cannot be directly applied to extract a victim model, as models hide among the model zoo behind the inference serving interface, and attackers cannot identify which model is being used. An intermediate step is therefore required to ensure that every input query gets its output from the victim model. To this end, we propose a query-efficient fingerprinting algorithm that enables the attacker to trigger any desired model consistently. We show that by using our fingerprinting algorithm, model extraction can achieve fidelity and accuracy scores within $1\%$ of the scores obtained when attacking in a single-model setting, and up to $14.6\%$ gain in accuracy and up to $7.7\%$ gain in fidelity compared to the naive attack. Finally, we counter the proposed attack with a noise-based defense mechanism that thwarts fingerprinting by adding noise to the specified performance metrics. Our defense strategy reduces the attack's accuracy and fidelity by up to $9.8\%$ and $4.8\%$, respectively (on medium-sized model extraction). We show that the proposed defense induces a fundamental trade-off between the level of protection and system goodput, achieving configurable and significant victim model extraction protection while maintaining acceptable goodput ($>80\%$). We provide anonymous access to our code.
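
To make the fingerprinting step concrete, here is a deliberately simplified sketch: a toy model zoo where each model's outputs on a small probe set form a distinguishing signature. The zoo, probe set, and `fingerprint` function are hypothetical illustrations of the idea, not the paper's algorithm or serving API.

```python
from typing import Callable, List

# Toy "model zoo": each model is just a different threshold classifier.
zoo = {
    "small":  lambda x: int(x > 0.3),
    "medium": lambda x: int(x > 0.5),
    "large":  lambda x: int(x > 0.7),
}

probes: List[float] = [0.2, 0.4, 0.6, 0.8]  # chosen to separate the zoo

# Precompute each model's signature on the probe set.
signatures = {name: [f(p) for p in probes] for name, f in zoo.items()}

def fingerprint(serve: Callable[[float], int]) -> str:
    """Match the server's probe responses against known signatures to
    identify which zoo model is currently answering queries."""
    outputs = [serve(p) for p in probes]
    for name, sig in signatures.items():
        if outputs == sig:
            return name
    return "unknown"

# The server silently picks some model; the attacker recovers which one,
# so subsequent extraction queries can be aimed at the victim model.
served_model = zoo["medium"]
print(fingerprint(served_model))  # -> "medium"
```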

* 17 pages, 9 figures 

Differentially Private Wireless Federated Learning Using Orthogonal Sequences

Jun 14, 2023
Xizixiang Wei, Tianhao Wang, Ruiquan Huang, Cong Shen, Jing Yang, H. Vincent Poor

We propose a novel privacy-preserving uplink over-the-air computation (AirComp) method, termed FLORAS, for single-input single-output (SISO) wireless federated learning (FL) systems. From the communication design perspective, FLORAS eliminates the requirement of channel state information at the transmitters (CSIT) by leveraging the properties of orthogonal sequences. From the privacy perspective, we prove that FLORAS can offer both item-level and client-level differential privacy (DP) guarantees. Moreover, by adjusting the system parameters, FLORAS can flexibly achieve different DP levels at no additional cost. A novel FL convergence bound is derived which, combined with the privacy guarantees, allows for a smooth tradeoff between convergence rate and differential privacy levels. Numerical results demonstrate the advantages of FLORAS compared with the baseline AirComp method, and validate that our analytical results can guide the design of privacy-preserving FL under different tradeoff requirements between model convergence and privacy levels.
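
The orthogonal-sequence idea can be sketched in a toy baseband simulation: clients spread their scalar updates with mutually orthogonal sequences, the signals superimpose over the air, and despreading at the server recovers the aggregate without per-client CSIT, with channel noise supplying the DP randomness. This is a minimal illustration under simplified assumptions (unit channels, Walsh-Hadamard sequences), not the FLORAS specification.

```python
import numpy as np

rng = np.random.default_rng(1)
num_clients, seq_len = 4, 8

# Rows of a Walsh-Hadamard matrix serve as orthogonal spreading sequences.
H = np.array([[1.0 if bin(i & j).count("1") % 2 == 0 else -1.0
               for j in range(seq_len)] for i in range(seq_len)])
seqs = H[:num_clients]                      # one sequence per client

updates = rng.normal(size=num_clients)      # scalar model updates
tx = (updates[:, None] * seqs).sum(axis=0)  # signals add over the air
rx = tx + 0.1 * rng.normal(size=seq_len)    # receiver/channel noise

# Despreading recovers the *sum* of updates (all FL aggregation needs)
# without per-client CSIT; the residual noise is the DP noise source.
aggregate = rx @ seqs.sum(axis=0) / seq_len
print(aggregate, updates.sum())
```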

* 33 pages, 5 figures, submitted to IEEE TSP 

Interpreting GNN-based IDS Detections Using Provenance Graph Structural Features

Jun 06, 2023
Kunal Mukherjee, Joshua Wiedemeier, Tianhao Wang, Muhyun Kim, Feng Chen, Murat Kantarcioglu, Kangkook Jee

The black-box nature of complex Neural Network (NN)-based models has hindered their widespread adoption in security domains due to the lack of logical explanations and actionable follow-ups for their predictions. To enhance the transparency and accountability of Graph Neural Network (GNN) security models used in system provenance analysis, we propose PROVEXPLAINER, a framework for projecting abstract GNN decision boundaries onto interpretable feature spaces. We first replicate the decision-making process of GNN-based security models using simpler, explainable models such as Decision Trees (DTs). To maximize the accuracy and fidelity of the surrogate models, we propose novel graph structural features founded on classical graph theory and enhanced by an extensive data study informed by security domain knowledge. Our graph structural features are closely tied to problem-space actions in the system provenance domain, which allows the detection results to be explained in descriptive, human language. PROVEXPLAINER allowed simple DT models to achieve 95% fidelity to the GNN on program classification tasks with general graph structural features, and 99% fidelity on malware detection tasks with a task-specific feature package tailored for direct interpretation. The explanations for malware classification are demonstrated with case studies of five real-world malware samples across three malware families.
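
The surrogate-fitting step is standard model distillation and can be sketched in a few lines: train a shallow decision tree on graph structural features to reproduce the GNN's labels, then report agreement (fidelity). The three stand-in features and the synthetic "GNN" labels below are hypothetical; the paper's actual feature package is task-specific.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy structural features per provenance graph, e.g. node count,
# average degree, max fan-out (stand-ins for the paper's features).
X = rng.normal(size=(500, 3))

# Stand-in for the black-box GNN's predictions on the same graphs.
gnn_labels = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)

# Fit an interpretable surrogate that mimics the GNN's decisions.
surrogate = DecisionTreeClassifier(max_depth=3).fit(X, gnn_labels)

# Fidelity: how often the surrogate agrees with the GNN it explains.
fidelity = (surrogate.predict(X) == gnn_labels).mean()
print(f"fidelity to GNN: {fidelity:.2%}")
```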

Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation

May 12, 2023
Yifei Min, Jiafan He, Tianhao Wang, Quanquan Gu

We study multi-agent reinforcement learning in the setting of episodic Markov decision processes, where multiple agents cooperate via communication through a central server. We propose a provably efficient algorithm based on value iteration that enables asynchronous communication while retaining the advantage of cooperation at low communication overhead. With linear function approximation, we prove that our algorithm enjoys an $\tilde{\mathcal{O}}(d^{3/2}H^2\sqrt{K})$ regret with $\tilde{\mathcal{O}}(dHM^2)$ communication complexity, where $d$ is the feature dimension, $H$ is the horizon length, $M$ is the total number of agents, and $K$ is the total number of episodes. We also provide a lower bound showing that $\Omega(dM)$ communication complexity is necessary to improve performance through collaboration.
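
Algorithms in this line of work typically decide when an agent communicates via a data-dependent trigger; a common choice (assumed here purely for illustration, not necessarily the paper's exact rule) is to sync only when the agent's locally accumulated covariance matrix has grown enough relative to the last server snapshot, which keeps communication complexity independent of the number of episodes.

```python
import numpy as np

def should_sync(Lambda_local: np.ndarray,
                Lambda_server: np.ndarray,
                threshold: float = 2.0) -> bool:
    """Sync when det(local) / det(server snapshot) exceeds a threshold,
    i.e., when enough new information has accumulated locally."""
    _, logdet_local = np.linalg.slogdet(Lambda_local)
    _, logdet_server = np.linalg.slogdet(Lambda_server)
    return logdet_local - logdet_server > np.log(threshold)

d = 4
server_snapshot = np.eye(d)
local = server_snapshot + np.outer(np.ones(d), np.ones(d))  # new data
print(should_sync(local, server_snapshot))  # -> True: time to sync
```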

* Published at the 40th International Conference on Machine Learning (ICML 2023) 

Neural Lumped Parameter Differential Equations with Application in Friction-Stir Processing

Apr 18, 2023
James Koch, WoongJo Choi, Ethan King, David Garcia, Hrishikesh Das, Tianhao Wang, Ken Ross, Keerti Kappagantula

Lumped parameter methods aim to simplify the evolution of spatially extended or continuous physical systems to that of a "lumped" element representative of the physical scales of the modeled system. For systems where the definition of a lumped element or its associated physics is unknown, modeling tasks may be restricted to full-fidelity simulations of the system's physics. In this work, we consider data-driven modeling tasks with limited point-wise measurements of otherwise continuous systems. We build upon the notion of the Universal Differential Equation (UDE) to construct data-driven models that reduce dynamics to those of a lumped parameter and infer its properties. The flexibility of UDEs allows for composing various known physical priors suitable for application-specific modeling tasks, including lumped parameter methods. The motivating example for this work is the plunge and dwell stages of friction-stir welding; specifically, (i) mapping the power input into the tool to a point measurement of temperature, and (ii) using this learned mapping for process control.
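
A lumped-parameter UDE of this kind can be sketched in a few lines: a known Newtonian-cooling term plus a small neural network that absorbs the unknown mapping from power input to temperature. The tiny MLP, the cooling constant, and the Euler rollout below are illustrative assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
# Tiny MLP weights; in practice these are fit to measurements.
W1, b1 = rng.normal(size=(8, 2)) * 0.1, np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)) * 0.1, np.zeros(1)

def nn_term(T, power):
    # Learned closure term absorbing the unknown physics.
    h = np.tanh(W1 @ np.array([T, power]) + b1)
    return float(W2 @ h + b2)

def dTdt(T, power, T_amb=25.0, k=0.05):
    # Known physics (Newtonian cooling) + learned lumped-parameter term.
    return -k * (T - T_amb) + nn_term(T, power)

# Forward-Euler rollout of the lumped temperature state; training would
# compare this trajectory against point-wise thermocouple measurements.
T, dt = 25.0, 0.1
for _ in range(100):
    T += dt * dTdt(T, power=2.0)
print(T)
```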

Practical Differentially Private and Byzantine-resilient Federated Learning

Apr 15, 2023
Zihang Xiang, Tianhao Wang, Wanyu Lin, Di Wang

Privacy and Byzantine resilience are two indispensable requirements for a federated learning (FL) system. Although there have been extensive studies on each in its own track, solutions that consider both remain sparse. This is due to difficulties in reconciling privacy-preserving and Byzantine-resilient algorithms. In this work, we propose a solution to this two-fold issue. We use our version of the differentially private stochastic gradient descent (DP-SGD) algorithm to preserve privacy and then apply our Byzantine-resilient algorithms. We note that while existing works follow this general approach, an in-depth analysis of the interplay between DP and Byzantine resilience has been missing, leading to unsatisfactory performance. Specifically, previous works strive to reduce the impact of the random noise introduced by DP on Byzantine aggregation. In contrast, we leverage the random noise to construct an aggregation rule that effectively rejects many existing Byzantine attacks. We provide both theoretical proofs and empirical experiments to show our protocol is effective: it retains high accuracy while preserving the DP guarantee and Byzantine resilience. Compared with previous work, our protocol 1) achieves significantly higher accuracy even in a high-privacy regime; and 2) works well even when up to 90% of the distributed workers are Byzantine.
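
The interplay described above can be sketched as follows: each worker's DP-SGD update has a predictable norm profile because of clipping and calibrated noise, and the server can use that profile to reject implausible updates. The norm-test aggregation below is a hypothetical stand-in for the paper's rule, meant only to show how DP noise can aid, rather than hinder, Byzantine filtering.

```python
import numpy as np

rng = np.random.default_rng(0)
d, clip, sigma = 10, 1.0, 0.5

def dp_sgd_update(grad: np.ndarray) -> np.ndarray:
    # DP-SGD at the worker: clip to norm <= clip, add calibrated noise.
    g = grad * min(1.0, clip / (np.linalg.norm(grad) + 1e-12))
    return g + sigma * rng.normal(size=d)

def robust_aggregate(updates: list) -> np.ndarray:
    # Honest DP updates have norms concentrated around
    # sqrt(clip^2 + d * sigma^2); reject updates far outside that range.
    expected = np.sqrt(clip**2 + d * sigma**2)
    kept = [u for u in updates if np.linalg.norm(u) < 3 * expected]
    return np.mean(kept, axis=0)

honest = [dp_sgd_update(rng.normal(size=d)) for _ in range(8)]
byzantine = [100.0 * np.ones(d) for _ in range(2)]  # malicious updates
print(np.linalg.norm(robust_aggregate(honest + byzantine)))
```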

FACE-AUDITOR: Data Auditing in Facial Recognition Systems

Apr 05, 2023
Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Yang Zhang

Few-shot-based facial recognition systems have gained increasing attention due to their scalability and ability to work with a few face images during the model deployment phase. However, the power of facial recognition systems enables entities with moderate resources to canvas the Internet and build well-performing facial recognition models without people's awareness and consent. To prevent face images from being misused, one straightforward approach is to modify the raw face images before sharing them; this, however, inevitably destroys semantic information, complicates retroactive tracing, and remains prone to adaptive attacks. Therefore, an auditing method that does not interfere with the facial recognition model's utility and cannot be quickly bypassed is urgently needed. In this paper, we formulate the auditing process as a user-level membership inference problem and propose FACE-AUDITOR, a complete toolkit that can carefully choose the probing set to query the few-shot-based facial recognition model and determine whether any of a user's face images was used to train the model. We further propose using the similarity scores between the original face images as reference information to improve auditing performance. Extensive experiments on multiple real-world face image datasets show that FACE-AUDITOR can achieve auditing accuracy of up to $99\%$. Finally, we show that FACE-AUDITOR is robust in the presence of several perturbation mechanisms applied to the training images or the target models. The source code of our experiments can be found at \url{https://github.com/MinChen00/Face-Auditor}.
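
The auditing decision can be sketched as a simple hypothesis test: if the model's similarity scores on a user's probing set are systematically higher than what raw-image similarity (the reference information) explains, the user's images were likely in training. The scores, threshold, and decision rule below are hypothetical illustrations, not FACE-AUDITOR's trained auditing model.

```python
import numpy as np

rng = np.random.default_rng(0)

def audit(model_scores: np.ndarray,
          reference_scores: np.ndarray,
          threshold: float = 0.15) -> bool:
    """Flag membership when the model is systematically more confident
    on this user's probes than raw-image similarity alone explains."""
    gap = model_scores.mean() - reference_scores.mean()
    return gap > threshold

# Members: model similarity inflated relative to raw-image similarity.
member_scores = rng.uniform(0.7, 0.95, size=10)
nonmember_scores = rng.uniform(0.4, 0.7, size=10)
reference = rng.uniform(0.45, 0.65, size=10)

print(audit(member_scores, reference))     # likely True
print(audit(nonmember_scores, reference))  # likely False
```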

* To appear in the 32nd USENIX Security Symposium, August 2023, Anaheim, CA, USA 

GlucoSynth: Generating Differentially-Private Synthetic Glucose Traces

Mar 02, 2023
Josephine Lamp, Mark Derdzinski, Christopher Hannemann, Joost van der Linden, Lu Feng, Tianhao Wang, David Evans

In this paper we focus on the problem of generating high-quality, private synthetic glucose traces, a task generalizable to many other time series sources. Existing methods for time series data synthesis, such as those based on Generative Adversarial Networks (GANs), are unable to capture the innate characteristics of glucose data and, in terms of privacy, either include no formal privacy guarantees or, in order to uphold a strong formal guarantee, severely degrade the utility of the synthetic data. Therefore, in this paper we present GlucoSynth, a novel privacy-preserving GAN framework for generating synthetic glucose traces. The core intuition of our approach is to conserve relationships among motifs (glucose events) within the traces, in addition to typical temporal dynamics. Moreover, we integrate differential privacy into the framework to provide strong formal privacy guarantees. Finally, we provide a comprehensive evaluation of the real-world utility of the data using 1.2 million glucose traces.
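
One standard way to integrate differential privacy into GAN training (assumed here for illustration; the paper's exact mechanism may differ) is to train the discriminator with DP-SGD, since it is the only component that touches real traces; the generator then inherits the guarantee by post-processing. A toy discriminator step:

```python
import numpy as np

rng = np.random.default_rng(0)
clip, sigma, lr, batch = 1.0, 1.2, 0.01, 16

def private_disc_step(w: np.ndarray, per_example_grads: np.ndarray):
    # Per-example clipping bounds each real trace's influence ...
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip / (norms + 1e-12))
    # ... and Gaussian noise on the summed gradient yields the standard
    # (epsilon, delta)-DP guarantee of the Gaussian mechanism.
    noisy = clipped.sum(axis=0) + sigma * clip * rng.normal(size=w.shape)
    return w - lr * noisy / batch

w = np.zeros(8)
grads = rng.normal(size=(batch, 8))  # stand-in per-example gradients
w = private_disc_step(w, grads)
print(w)
```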

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning

Feb 24, 2023
Ruitu Xu, Yifei Min, Tianhao Wang, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang

We study a heterogeneous agent macroeconomic model with an infinite number of households and firms competing in a labor market. Each household earns income and engages in consumption at each time step while aiming to maximize a concave utility subject to the underlying market conditions. The households seek the optimal saving strategy that maximizes their discounted cumulative utility given the market conditions, while the firms determine those conditions by maximizing corporate profit based on the household population's behavior. The model captures a wide range of applications in macroeconomic studies, and we propose a data-driven reinforcement learning framework that finds the regularized competitive equilibrium of the model. The proposed algorithm enjoys theoretical guarantees of convergence to the market equilibrium at a sub-linear rate.
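
The interaction described above suggests a natural fixed-point loop: households best-respond to the posted market condition, firms reset the condition from aggregate household behavior, and the iteration settles at a regularized equilibrium. The closed-form "policy", the wage dynamics, and the regularization strength below are purely illustrative stand-ins for the paper's RL framework.

```python
import numpy as np

def household_best_response(wage: float, reg: float = 0.5) -> float:
    # Regularized saving rate in (0, 1); a closed-form stand-in for the
    # household's RL policy update.
    return 1.0 / (1.0 + np.exp(-(wage - 1.0) / reg))

def firm_update(avg_saving: float) -> float:
    # Firms set the market wage as a decreasing function of aggregate
    # saving (profit maximization collapsed to one line).
    return 2.0 - avg_saving

wage = 1.5
for _ in range(100):
    saving = household_best_response(wage)
    wage = firm_update(saving)
print(wage, saving)  # settles at the toy regularized equilibrium
```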

* 44 pages 