Wenhao Ding

Your Room is not Private: Gradient Inversion Attack for Deep Q-Learning

Jun 15, 2023

Miao Li, Wenhao Ding, Ding Zhao

Abstract: The prominence of embodied Artificial Intelligence (AI), which empowers robots to navigate, perceive, and engage within virtual environments, has attracted significant attention, owing to the remarkable advancements in computer vision and large language models. Privacy emerges as a pivotal concern within the realm of embodied AI, as robots access substantial personal information. However, the issue of privacy leakage in embodied AI tasks, particularly in relation to decision-making algorithms, has not received adequate consideration in research. This paper aims to address this gap by proposing an attack on the Deep Q-Learning algorithm, utilizing gradient inversion to reconstruct states, actions, and Q-values. The choice of using gradients for the attack is motivated by the fact that commonly employed federated learning techniques solely utilize gradients computed based on private user data to optimize models, without storing or transmitting the data to public servers. Nevertheless, these gradients contain sufficient information to potentially expose private data. To validate our approach, we conduct experiments on the AI2THOR simulator and evaluate our algorithm on active perception, a prevalent task in embodied AI. The experimental results demonstrate the effectiveness of our method in recovering all information from the data across all 120 room layouts.

* 15 pages, 9 figures
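
To illustrate why shared gradients can leak a state, an action, and a Q-value, here is a minimal sketch under strong simplifying assumptions: a single linear Q-layer, one transition, and a squared TD loss. It is not the attack from the paper, which targets deep Q-networks; the function names `q_values`, `td_gradients`, and `invert_gradients` are hypothetical and chosen only for this example.

```python
# Toy gradient leakage for a linear Q-function Q(s) = W s + b.
# Assumption: loss = 0.5 * (Q(s)[a] - target)^2 on a single transition.

def q_values(W, b, s):
    """Return Q(s) for every action (one row of W per action)."""
    return [sum(w * x for w, x in zip(row, s)) + bias for row, bias in zip(W, b)]

def td_gradients(W, b, s, a, target):
    """Gradients of the squared TD loss with respect to W and b."""
    delta = q_values(W, b, s)[a] - target  # TD error
    grad_b = [delta if i == a else 0.0 for i in range(len(b))]
    grad_W = [[delta * x if i == a else 0.0 for x in s] for i in range(len(b))]
    return grad_W, grad_b

def invert_gradients(grad_W, grad_b, W, b):
    """Recover (state, action, Q-value) from the gradients and the shared model."""
    # Only the row of the chosen action receives a nonzero gradient.
    a = max(range(len(grad_b)), key=lambda i: abs(grad_b[i]))
    if grad_b[a] == 0.0:
        raise ValueError("zero TD error: the gradient carries no information")
    # grad_W[a] = delta * s and grad_b[a] = delta, so their ratio is s.
    s = [g / grad_b[a] for g in grad_W[a]]
    return s, a, q_values(W, b, s)[a]

if __name__ == "__main__":
    W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
    b = [0.1, -0.2]
    private_state, private_action = [0.2, 0.7, 0.4], 1
    gW, gb = td_gradients(W, b, private_state, private_action, target=1.0)
    print(invert_gradients(gW, gb, W, b))
```

In this linear case the recovery is exact up to floating-point error; with deep networks and image observations no such closed form is available, and the reconstruction has to be posed as an optimization problem instead.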

Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models

May 18, 2023

Wenhao Ding, Tong Che, Ding Zhao, Marco Pavone

Abstract: Recently, reward-conditioned reinforcement learning (RCRL) has gained popularity due to its simplicity, flexibility, and off-policy nature. However, we will show that current RCRL approaches are fundamentally limited and fail to address two critical challenges of RCRL -- improving generalization on high reward-to-go (RTG) inputs, and avoiding out-of-distribution (OOD) RTG queries during testing time. To address these challenges when training vanilla RCRL architectures, we propose Bayesian Reparameterized RCRL (BR-RCRL), a novel set of inductive biases for RCRL inspired by Bayes' theorem. BR-RCRL removes a core obstacle preventing vanilla RCRL from generalizing on high RTG inputs -- the model's tendency to treat different RTG inputs as independent values, which we term "RTG Independence". BR-RCRL also allows us to design an accompanying adaptive inference method, which maximizes total returns while avoiding OOD queries that yield unpredictable behaviors in vanilla RCRL methods. We show that BR-RCRL achieves state-of-the-art performance on the Gym-Mujoco and Atari offline RL benchmarks, improving upon vanilla RCRL by up to 11%.

* Accepted to ICML 2023

Critical Scenario Generation for Developing Trustworthy Autonomy

Apr 29, 2023

Wenhao Ding

Abstract: Autonomous systems, such as self-driving vehicles, quadrupeds, and robot manipulators, are largely enabled by the rapid development of artificial intelligence. However, such systems face several trustworthiness challenges, such as safety, robustness, and generalization, due to their deployment in open-ended and real-time environments. To evaluate and improve trustworthiness, simulations, or so-called digital twins, are widely used for system development at low cost and with high efficiency. A key component of virtual simulations is the scenario, which consists of static and dynamic objects, specific tasks, and evaluation metrics. However, designing diverse, realistic, and effective scenarios is still a challenging problem. One straightforward way is to create scenarios through human design, which is time-consuming and limited by the experience of experts. Another method commonly used in the self-driving domain is log replay. This method collects scenario data in the real world and then replays it in simulations or adds random perturbations. Although the replayed scenarios are realistic, most of the collected scenarios are redundant, since they are ordinary scenarios that cover only a small portion of critical cases. The desired scenarios should cover all cases in the real world, especially rare but critical events with extremely low probability. Critical scenarios are rare but important for testing autonomous systems under risky conditions and unpredictable perturbations, which reveal their trustworthiness.

* research statement

Learning to View: Decision Transformers for Active Object Detection

Jan 23, 2023

Wenhao Ding, Nathalie Majcherczyk, Mohit Deshpande, Xuewei Qi, Ding Zhao, Rajasimman Madhivanan, Arnie Sen

Abstract: Active perception describes a broad class of techniques that couple planning and perception systems to move the robot in a way that gives it more information about the environment. In most robotic systems, perception is typically independent of motion planning. For example, traditional object detection is passive: it operates only on the images it receives. However, we have a chance to improve the results if we allow planning to consume detection signals and move the robot to collect views that maximize the quality of the results. In this paper, we use reinforcement learning (RL) methods to control the robot in order to obtain images that maximize the detection quality. Specifically, we propose using a Decision Transformer with online fine-tuning, which first optimizes the policy with a pre-collected expert dataset and then improves the learned policy by exploring better solutions in the environment. We evaluate the performance of the proposed method on an interactive dataset collected from an indoor scenario simulator. Experimental results demonstrate that our method outperforms all baselines, including expert policy and pure offline RL methods. We also provide exhaustive analyses of the reward distribution and observation space.

* Accepted to ICRA 2023

Solving practical multi-body dynamics problems using a single neural operator

Oct 01, 2022

Wenhao Ding, Qing He, Hanghang Tong, Qingjing Wang, Ping Wang

Abstract: As a fundamental design tool in many engineering disciplines, multi-body dynamics (MBD) models a complex structure with a system of differential equations containing multiple physical quantities. Engineers must constantly adjust structures at the design stage, which requires a highly efficient solver. The rise of deep learning technologies has offered new perspectives on MBD. Unfortunately, existing black-box models suffer from poor accuracy and robustness, while the advanced methodologies of single-output operator regression cannot deal with multiple quantities simultaneously. To address these challenges, we propose PINO-MBD, a deep learning framework for solving practical MBD problems based on the theory of physics-informed neural operator (PINO). PINO-MBD uses a single network for all quantities in a multi-body system, instead of training dozens, or even hundreds of networks as in the existing literature. We demonstrate the flexibility and feasibility of PINO-MBD for one toy example and two practical applications: vehicle-track coupled dynamics (VTCD) and reliability analysis of a four-storey building. The performance of VTCD indicates that our framework outperforms existing software and machine learning-based methods in terms of efficiency and precision, respectively. For the reliability analysis, PINO-MBD can provide higher-resolution results in less than a quarter of the time incurred when using the probability density evolution method (PDEM). This framework integrates mechanics and deep learning technologies and may reveal a new concept for MBD and probabilistic engineering.

Trustworthy Reinforcement Learning Against Intrinsic Vulnerabilities: Robustness, Safety, and Generalizability

Sep 16, 2022

Mengdi Xu, Zuxin Liu, Peide Huang, Wenhao Ding, Zhepeng Cen, Bo Li, Ding Zhao

Abstract: A trustworthy reinforcement learning algorithm should be competent in solving challenging real-world problems, including robustly handling uncertainties, satisfying safety constraints to avoid catastrophic failures, and generalizing to unseen scenarios during deployments. This study aims to provide an overview of these main perspectives of trustworthy reinforcement learning considering its intrinsic vulnerabilities on robustness, safety, and generalizability. In particular, we give rigorous formulations, categorize corresponding methodologies, and discuss benchmarks for each perspective. Moreover, we provide an outlook section to spur promising future directions with a brief discussion on extrinsic vulnerabilities considering human feedback. We hope this survey can bring separate threads of studies together in a unified framework and promote the trustworthiness of reinforcement learning.

* 36 pages, 5 figures

Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning

Jul 19, 2022

Wenhao Ding, Haohong Lin, Bo Li, Ding Zhao

Abstract: As a pivotal component to attaining generalizable solutions in human intelligence, reasoning provides great potential for reinforcement learning (RL) agents' generalization towards varied goals by summarizing part-to-whole arguments and discovering cause-and-effect relations. However, how to discover and represent causalities remains a huge gap that hinders the development of causal RL. In this paper, we augment Goal-Conditioned RL (GCRL) with Causal Graph (CG), a structure built upon the relation between objects and events. We formulate the GCRL problem, in a novel way, as variational likelihood maximization with CG as latent variables. To optimize the derived objective, we propose a framework with theoretical performance guarantees that alternates between two steps: using interventional data to estimate the posterior of CG, and using CG to learn generalizable models and interpretable policies. Due to the lack of public benchmarks that verify generalization capability under reasoning, we design nine tasks and then empirically show the effectiveness of the proposed method against five baselines on these tasks. Further theoretical analysis shows that our performance improvement is attributed to the virtuous cycle of causal discovery, transition modeling, and policy training, which aligns with the experimental evidence in extensive ablation studies.

* 28 pages, 5 figures, under review

SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles

Jun 20, 2022

Chejian Xu, Wenhao Ding, Weijie Lyu, Zuxin Liu, Shuai Wang, Yihan He, Hanjiang Hu, Ding Zhao, Bo Li

Abstract: As shown by recent studies, machine intelligence-enabled systems are vulnerable to test cases resulting from either adversarial manipulation or natural distribution shifts. This has raised great concerns about deploying machine learning algorithms for real-world applications, especially in safety-critical domains such as autonomous driving (AD). On the other hand, traditional AD testing on naturalistic scenarios requires hundreds of millions of driving miles due to the high dimensionality and rareness of the safety-critical scenarios in the real world. As a result, several approaches for autonomous driving evaluation have been explored, which are usually, however, based on different simulation platforms, types of safety-critical scenarios, scenario generation algorithms, and driving route variations. Thus, despite a large amount of effort in autonomous driving testing, it is still challenging to compare and understand the effectiveness and efficiency of different testing scenario generation algorithms and testing mechanisms under similar conditions. In this paper, we aim to provide the first unified platform, SafeBench, to integrate different types of safety-critical testing scenarios, scenario generation algorithms, and other variations such as driving routes and environments. Meanwhile, we implement 4 deep reinforcement learning-based AD algorithms with 4 types of input (e.g., bird's-eye view, camera) to perform fair comparisons on SafeBench. We find our generated testing scenarios are indeed more challenging and observe the trade-off between the performance of AD agents under benign and safety-critical testing scenarios. We believe our unified platform SafeBench for large-scale and effective autonomous driving testing will motivate the development of new testing scenario generation and safe AD algorithms. SafeBench is available at https://safebench.github.io.

A Survey on Safety-Critical Scenario Generation for Autonomous Driving -- A Methodological Perspective

Feb 07, 2022

Wenhao Ding, Chejian Xu, Haohong Lin, Bo Li, Ding Zhao

Abstract: Autonomous driving systems have seen great development in recent years thanks to advances in sensing and decision-making. One critical obstacle for their massive deployment in the real world is the evaluation of safety. Most existing driving systems are still trained and evaluated on naturalistic scenarios that account for the vast majority of daily life or heuristically-generated adversarial ones. However, the large population of cars requires an extremely low collision rate, indicating safety-critical scenarios collected in the real world would be rare. Thus, methods to artificially generate scenarios become critical to manage the risk and reduce the cost. In this survey, we focus on the algorithms of safety-critical scenario generation. We first provide a comprehensive taxonomy of existing algorithms by dividing them into three categories: data-driven generation, adversarial generation, and knowledge-based generation. Then, we discuss useful tools for scenario generation, including simulation platforms and packages. Finally, we extend our discussion to five main challenges of current works -- fidelity, efficiency, diversity, transferability, controllability -- and the research opportunities opened up by these challenges.

* 16 pages, 4 figures

Certifiable Deep Importance Sampling for Rare-Event Simulation of Black-Box Systems

Nov 03, 2021

Mansur Arief, Yuanlu Bai, Wenhao Ding, Shengyi He, Zhiyuan Huang, Henry Lam, Ding Zhao

Abstract: Rare-event simulation techniques, such as importance sampling (IS), constitute powerful tools to speed up challenging estimation of rare catastrophic events. These techniques often leverage the knowledge and analysis of underlying system structures to endow desirable efficiency guarantees. However, black-box problems, especially those arising from recent safety-critical applications of AI-driven physical systems, can fundamentally undermine their efficiency guarantees and lead to dangerous under-estimation without being diagnostically detected. We propose a framework called Deep Probabilistic Accelerated Evaluation (Deep-PrAE) to design statistically guaranteed IS, by converting black-box samplers that are versatile but could lack guarantees into samplers with what we call a relaxed efficiency certificate that allows accurate estimation of bounds on the rare-event probability. We present the theory of Deep-PrAE that combines the dominating point concept with rare-event set learning via deep neural network classifiers, and demonstrate its effectiveness in numerical examples including the safety-testing of intelligent driving algorithms.

* The conference version of this paper has appeared in AISTATS 2021 (arXiv:2006.15722)
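
As background for the importance sampling idea mentioned in the abstract above, the following is a minimal sketch of classical IS for a one-dimensional Gaussian tail probability P(X > t) with X ~ N(0, 1). It is not Deep-PrAE: the rare-event set is known in closed form here, the proposal is simply the standard normal shifted to t (the dominating point of this particular set), and the function names are hypothetical.

```python
import math
import random

def estimate_tail_probability(threshold, n_samples, seed=0):
    """Importance sampling estimate of P(X > threshold) for X ~ N(0, 1).

    Assumption: the proposal is N(threshold, 1), i.e. the mean is shifted
    onto the boundary of the rare-event set.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal
        if x > threshold:
            # Likelihood ratio phi(x) / phi(x - threshold)
            #   = exp(-threshold * x + threshold**2 / 2)
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples

def exact_tail_probability(threshold):
    """Closed-form Gaussian tail, used only to check the estimate."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))

if __name__ == "__main__":
    t = 4.0
    print("IS estimate:", estimate_tail_probability(t, 20000))
    print("exact value:", exact_tail_probability(t))
```

Naive Monte Carlo with the same number of samples would usually see zero or one hit for t = 4, whereas about half of the proposal samples land in the rare-event set. When the set is unknown, as in the black-box setting the abstract describes, a mean shift like this cannot be chosen analytically.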