Sihong He

Robust Electric Vehicle Balancing of Autonomous Mobility-On-Demand System: A Multi-Agent Reinforcement Learning Approach

Jul 30, 2023
Sihong He, Shuo Han, Fei Miao

Electric autonomous vehicles (EAVs) are attracting attention in future autonomous mobility-on-demand (AMoD) systems due to their economic and societal benefits. However, EAVs' unique charging patterns (long charging times, high charging frequency, unpredictable charging behaviors, etc.) make it challenging to accurately predict EAV supply in E-AMoD systems. Furthermore, uncertainty in mobility-demand prediction makes it an urgent and challenging task to design an integrated vehicle-balancing solution under both supply and demand uncertainties. Despite the success of reinforcement learning-based E-AMoD balancing algorithms, state uncertainties in EV supply and mobility demand remain unexplored. In this work, we design a multi-agent reinforcement learning (MARL)-based framework for EAV balancing in E-AMoD systems, with adversarial agents that model the EAV supply and mobility demand uncertainties that may undermine vehicle-balancing solutions. We then propose a robust E-AMoD Balancing MARL (REBAMA) algorithm to train a robust EAV balancing policy that balances both the supply-demand ratio and the charging utilization rate across the whole city. Experiments show that our proposed robust method outperforms a non-robust MARL method that does not consider state uncertainties, improving the reward, charging utilization fairness, and supply-demand fairness by 19.28%, 28.18%, and 3.97%, respectively. Compared with a robust optimization-based method, the proposed MARL algorithm improves the reward, charging utilization fairness, and supply-demand fairness by 8.21%, 8.29%, and 9.42%, respectively.
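
To make the adversarial-agent idea concrete, below is a minimal sketch, assuming a NumPy state vector, a differentiable value signal, and a sign-of-gradient attack with an illustrative epsilon budget; it is not the paper's REBAMA implementation.

import numpy as np

# Minimal sketch: an adversary shifts the observed supply/demand state
# within an L-infinity budget before the balancing policy acts. The
# attack form and epsilon value are illustrative assumptions.
def perturb_state(state, grad_wrt_state, epsilon=0.05):
    return state + epsilon * np.sign(grad_wrt_state)

def robust_rollout_step(policy, value_grad, true_state):
    # Training on the perturbed observation encourages a balancing policy
    # that stays effective under supply/demand state uncertainty.
    observed = perturb_state(true_state, value_grad(true_state))
    return policy(observed)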

* accepted to International Conference on Intelligent Robots and Systems (IROS2023) 

Robust Multi-Agent Reinforcement Learning with State Uncertainty

Jul 30, 2023
Sihong He, Songyang Han, Sanbao Su, Shuo Han, Shaofeng Zou, Fei Miao

In real-world multi-agent reinforcement learning (MARL) applications, agents may not have perfect state information (e.g., due to inaccurate measurement or malicious attacks), which challenges the robustness of agents' policies. Although robustness is becoming increasingly important for MARL deployment, little prior work has studied state uncertainties in MARL, in either problem formulation or algorithm design. Motivated by this robustness issue and the lack of corresponding studies, we study the problem of MARL with state uncertainty and provide the first theoretical and empirical analysis of this challenging problem. We first model the problem as a Markov Game with state perturbation adversaries (MG-SPA) by introducing a set of state perturbation adversaries into a Markov Game. We then introduce robust equilibrium (RE) as the solution concept of an MG-SPA. We conduct a fundamental analysis of the MG-SPA, including conditions under which such a robust equilibrium exists. We then propose a robust multi-agent Q-learning (RMAQ) algorithm, with convergence guarantees, to find such an equilibrium. To handle high-dimensional state-action spaces, we design a robust multi-agent actor-critic (RMAAC) algorithm based on an analytical expression of the policy gradient derived in the paper. Our experiments show that the proposed RMAQ algorithm converges to the optimal value function, and that our RMAAC algorithm outperforms several MARL and robust MARL methods in multiple multi-agent environments when state uncertainty is present. The source code is public at \url{https://github.com/sihongho/robust_marl_with_state_uncertainty}.
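
To illustrate the max-min structure behind a robust Q update, here is a hedged tabular sketch, assuming discrete states and actions and an explicit neighbors() interface for the adversary's admissible perturbed states; it is a simplification, not the paper's RMAQ algorithm.

# Robust tabular Q update: the adversary presents the perturbed next
# state that minimizes the agent's best achievable value (min over
# neighbors, max over actions). All interfaces here are illustrative.
def robust_q_update(Q, s, a, r, s_next, actions, neighbors,
                    alpha=0.1, gamma=0.95):
    worst_next = min(
        max(Q.get((p, b), 0.0) for b in actions)
        for p in neighbors(s_next)
    )
    target = r + gamma * worst_next
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))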

* 50 pages, published in Transactions on Machine Learning Research (TMLR), 06/2023 

What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?

Dec 07, 2022
Songyang Han, Sanbao Su, Sihong He, Shuo Han, Haizhao Yang, Fei Miao

Various types of Multi-Agent Reinforcement Learning (MARL) methods have been developed under the assumption that agents' policies are based on true states. Recent works have improved the robustness of MARL under uncertainties in the reward, the transition probability, or other agents' policies. However, in real-world multi-agent systems, state estimates may be perturbed by sensor measurement noise or even adversaries. Agents' policies trained only on true state information will deviate from optimal solutions when facing adversarial state perturbations during execution, yet MARL under adversarial state perturbations has received limited study. Hence, in this work we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to study the fundamental properties of MARL under state uncertainties. We prove that an optimal agent policy and a robust Nash equilibrium do not always exist for an SAMG. Instead, we define a solution concept, the robust agent policy, for the proposed SAMG under adversarial state perturbations, in which agents aim to maximize the worst-case expected state value. We then design a gradient descent ascent-based robust MARL algorithm to learn robust policies for the MARL agents. Our experiments show that adversarial state perturbations decrease agents' rewards for several baselines from the existing literature, while our algorithm outperforms these baselines under state perturbations and significantly improves the robustness of MARL policies under state uncertainties.
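
A generic gradient descent ascent (GDA) step of the kind referenced above can be sketched as follows, assuming PyTorch tensors with requires_grad and illustrative learning rates; this sketches the general technique, not the paper's released code.

import torch

def gda_step(policy_params, delta, loss_fn, lr_policy=1e-3, lr_adv=1e-2):
    # The policy descends to minimize the worst-case loss, while the
    # adversarial state perturbation delta ascends to make it worse.
    loss = loss_fn(policy_params, delta)
    g_policy, g_delta = torch.autograd.grad(loss, [policy_params, delta])
    with torch.no_grad():
        policy_params -= lr_policy * g_policy  # descent (agent)
        delta += lr_adv * g_delta              # ascent (adversary)
    return loss.item()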

Data-Driven Distributionally Robust Electric Vehicle Balancing for Autonomous Mobility-on-Demand Systems under Demand and Supply Uncertainties

Nov 24, 2022
Sihong He, Zhili Zhang, Shuo Han, Lynn Pepin, Guang Wang, Desheng Zhang, John Stankovic, Fei Miao

Electric vehicles (EVs) are being rapidly adopted due to their economic and societal benefits, and autonomous mobility-on-demand (AMoD) systems also embrace this trend. However, the long charging times and high recharging frequency of EVs pose challenges to efficiently managing EV AMoD systems. The complicated dynamic charging and mobility processes of EV AMoD systems make demand and supply uncertainties significant when designing vehicle-balancing algorithms. In this work, we design a data-driven distributionally robust optimization (DRO) approach to balance EVs for both mobility service and the charging process. The optimization goal is to minimize the worst-case expected cost under both passenger mobility demand uncertainties and EV supply uncertainties. We then propose a novel algorithm for constructing distributional uncertainty sets that guarantees the produced parameters are contained in the desired confidence regions with a given probability. To solve the proposed DRO EV AMoD balancing problem, we derive an equivalent, computationally tractable convex optimization problem. Based on real-world EV data from a taxi system, we show that with our solution the average total balancing cost is reduced by 14.49%, and the average mobility fairness and charging fairness are improved by 15.78% and 34.51%, respectively, compared to solutions that do not consider uncertainties.
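
In generic DRO notation, a worst-case objective of the kind described here can be written as below; the moment-based ellipsoid set is a standard construction consistent with the abstract, with $\hat{\mu}$, $\hat{\Sigma}$, and $\theta$ as illustrative placeholders for the data-driven mean estimate, covariance estimate, and confidence radius.

\min_{x}\;\max_{\mathbb{P}\in\mathcal{P}}\;\mathbb{E}_{\mathbb{P}}\big[c(x,\xi)\big],
\qquad
\mathcal{P}=\Big\{\mathbb{P}:\big(\mathbb{E}_{\mathbb{P}}[\xi]-\hat{\mu}\big)^{\top}\hat{\Sigma}^{-1}\big(\mathbb{E}_{\mathbb{P}}[\xi]-\hat{\mu}\big)\le\theta\Big\},

where $x$ is the vehicle-balancing decision and $\xi$ collects the uncertain mobility demand and EV supply.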

* 16 pages 

Data-Driven Distributionally Robust Electric Vehicle Balancing for Mobility-on-Demand Systems under Demand and Supply Uncertainties

Oct 19, 2022
Sihong He, Lynn Pepin, Guang Wang, Desheng Zhang, Fei Miao

As electric vehicle (EV) technologies mature, EVs have been rapidly adopted in modern transportation systems and are expected to provide future autonomous mobility-on-demand (AMoD) service with economic and societal benefits. However, EVs require frequent recharging due to their limited and unpredictable cruising ranges, and they have to be managed efficiently given the dynamic charging process. It is therefore urgent and challenging to develop a computationally efficient algorithm that provides EV AMoD system performance guarantees under model uncertainties, instead of relying on heuristic demand or charging models. To accomplish this goal, this work designs a data-driven distributionally robust optimization approach for balancing the vehicle supply-demand ratio and charging station utilization, while minimizing the worst-case expected cost under both passenger mobility demand uncertainties and EV supply uncertainties. We then derive an equivalent, computationally tractable form that solves the distributionally robust problem efficiently under ellipsoid uncertainty sets constructed from data. Based on E-taxi system data from Shenzhen, we show that with the distributionally robust vehicle balancing method, the average total balancing cost is reduced by 14.49%, and the average unfairness of the supply-demand ratio and of utilization is reduced by 15.78% and 34.51%, respectively, compared with solutions that do not consider model uncertainties.
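
To show why ellipsoid uncertainty sets keep the robust problem tractable, here is a minimal sketch, assuming cvxpy, a toy linear cost, and the illustrative set {d : ||A(d - d_hat)||_2 <= 1}: the worst case of a linear cost over this ellipsoid reduces to a nominal term plus a second-order-cone term, so the robust counterpart stays convex.

import cvxpy as cp
import numpy as np

# max over {d : ||A(d - d_hat)||_2 <= 1} of x^T d equals
# x^T d_hat + ||A^{-T} x||_2, a convex (SOC) expression in x.
n = 4
d_hat = np.ones(n)             # nominal demand estimated from data (toy values)
A = np.eye(n)                  # shapes the confidence ellipsoid (toy value)
x = cp.Variable(n, nonneg=True)
robust_cost = x @ d_hat + cp.norm(np.linalg.inv(A).T @ x, 2)
prob = cp.Problem(cp.Minimize(robust_cost), [cp.sum(x) == 1])
prob.solve()                   # solved as a second-order cone program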

* Published in IROS 2020 

A Robust and Constrained Multi-Agent Reinforcement Learning Framework for Electric Vehicle AMoD Systems

Sep 17, 2022
Sihong He, Yue Wang, Shuo Han, Shaofeng Zou, Fei Miao

Electric vehicles (EVs) play critical roles in autonomous mobility-on-demand (AMoD) systems, but their unique charging patterns increase the model uncertainties in AMoD systems (e.g., in state transition probabilities). Since there is usually a mismatch between the training and test (true) environments, incorporating model uncertainty into system design is of critical importance in real-world applications. However, model uncertainties have not yet been explicitly considered in EV AMoD system rebalancing in the existing literature, and addressing them remains an urgent and challenging task. In this work, we design a robust and constrained multi-agent reinforcement learning (MARL) framework with transition kernel uncertainty for the EV rebalancing and charging problem. We then propose a robust and constrained MARL algorithm (ROCOMA) that trains a robust EV rebalancing policy to balance the supply-demand ratio and the charging utilization rate across the whole city under state transition uncertainty. Experiments show that ROCOMA learns an effective and robust rebalancing policy: it outperforms non-robust MARL methods when model uncertainties are present, increasing system fairness by 19.6% and decreasing rebalancing costs by 75.8%.
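
A common way to handle the constrained side of such a framework is a Lagrangian relaxation; the sketch below is a generic primal-dual update for a budgeted cost constraint, with the scalar reward/cost signals and budget as illustrative assumptions, not ROCOMA itself.

def lagrangian_objective(reward, cost, budget, lam):
    # Maximize reward while penalizing violation of E[cost] <= budget.
    return reward - lam * (cost - budget)

def dual_update(lam, cost, budget, lr=0.01):
    # Projected gradient ascent on the dual variable (kept nonnegative):
    # lam grows while the constraint is violated, tightening the penalty.
    return max(0.0, lam + lr * (cost - budget))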

* 8 pages 

Uncertainty Quantification of Collaborative Detection for Self-Driving

Sep 16, 2022
Sanbao Su, Yiming Li, Sihong He, Songyang Han, Chen Feng, Caiwen Ding, Fei Miao

Sharing information between connected and autonomous vehicles (CAVs) fundamentally improves the performance of collaborative object detection for self-driving. However, CAVs still face uncertainty in object detection due to practical challenges, which affects later modules in the self-driving pipeline such as planning and control. Hence, uncertainty quantification is crucial for safety-critical systems such as CAVs. Our work is the first to estimate the uncertainty of collaborative object detection. We propose a novel uncertainty quantification method, called Double-M Quantification, which tailors a moving block bootstrap (MBB) algorithm with direct modeling of the multivariate Gaussian distribution of each corner of the bounding box. Our method captures both epistemic and aleatoric uncertainty with one inference pass based on the offline Double-M training process, and it can be used with different collaborative object detectors. Through experiments on a comprehensive collaborative perception dataset, we show that our Double-M method achieves more than a 4x improvement in uncertainty score and more than a 3% accuracy improvement, compared with state-of-the-art uncertainty quantification methods. Our code is public at https://coperception.github.io/double-m-quantification.
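
Since the method centers on the moving block bootstrap, here is a minimal MBB resampling sketch, assuming a one-dimensional, time-ordered residual array; the block length and interface are illustrative, not the Double-M code.

import numpy as np

def moving_block_bootstrap(residuals, block_len=5, rng=None):
    # Resample overlapping blocks so the bootstrap sample preserves
    # short-range temporal correlation in the sequence.
    rng = np.random.default_rng(rng)
    n = len(residuals)
    blocks = [residuals[i:i + block_len] for i in range(n - block_len + 1)]
    picks = rng.integers(0, len(blocks), size=-(-n // block_len))  # ceil(n/block_len)
    return np.concatenate([blocks[p] for p in picks])[:n]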

* 6 pages, 3 figures 