Qingrui Zhang

Formation Control for Moving Target Enclosing via Relative Localization

Jul 28, 2023
Xueming Liu, Kunda Liu, Tianjiang Hu, Qingrui Zhang

In this paper, we investigate the problem of controlling multiple unmanned aerial vehicles (UAVs) to enclose a moving target in a distributed fashion based on relative distance and self-displacement measurements. A relative localization technique is developed based on recursive least squares estimation (RLSE) with a forgetting factor to estimate both the "UAV-UAV" and "UAV-target" relative positions. The formation enclosing motion is planned using a coupled oscillator model, which generates desired motions for the UAVs to distribute evenly on a circle. The coupled-oscillator-based motion also facilitates the exponential convergence of the relative localization owing to its persistent excitation nature. Based on the desired formation pattern generation strategy and the relative localization estimates, a cooperative formation tracking control scheme is proposed, which enables the formation geometric center to asymptotically converge to the moving target. The asymptotic convergence is analyzed theoretically for both the relative localization technique and the formation control algorithm. Numerical simulations are provided to show the efficiency of the proposed algorithm. Experiments with three quadrotors tracking one target are conducted to evaluate the proposed target enclosing method on real platforms.

* 8 pages, accepted by IEEE CDC 2023 
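
As a rough illustration of the relative localization component, the sketch below implements a generic recursive least squares update with a forgetting factor. The measurement model, initial covariance, and forgetting factor value are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class ForgettingRLS:
    """Generic recursive least squares with a forgetting factor.

    Estimates x in the linear model y_k = H_k @ x + noise, discounting
    old data so the estimate can track a slowly varying quantity such
    as a UAV-UAV or UAV-target relative position.
    """

    def __init__(self, dim, forgetting=0.98):
        self.x = np.zeros(dim)        # current estimate
        self.P = np.eye(dim) * 1e3    # large initial covariance
        self.lam = forgetting         # forgetting factor in (0, 1]

    def update(self, H, y):
        H = np.atleast_2d(H)
        y = np.atleast_1d(y)
        # Gain: K = P H^T (lam I + H P H^T)^{-1}
        S = self.lam * np.eye(len(y)) + H @ self.P @ H.T
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (y - H @ self.x)
        self.P = (self.P - K @ H @ self.P) / self.lam
        return self.x
```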

Reinforced Potential Field for Multi-Robot Motion Planning in Cluttered Environments

Jul 26, 2023
Dengyu Zhang, Xinyu Zhang, Zheng Zhang, Bo Zhu, Qingrui Zhang

Motion planning is challenging for multiple robots in cluttered environments without communication, especially in terms of real-time efficiency, motion safety, distributed computation, and trajectory optimality. In this paper, a reinforced potential field method is developed for distributed multi-robot motion planning, which is a synthesized design of reinforcement learning and artificial potential fields. An observation embedding with a self-attention mechanism is presented to model the robot-robot and robot-environment interactions. A soft wall-following rule is developed to improve trajectory smoothness. Our method belongs to reactive planning, but environment properties are implicitly encoded. The number of robots in our method can be scaled up arbitrarily. The performance improvement over vanilla APF and RL methods is demonstrated via numerical simulations. Experiments are also performed with quadrotors to further illustrate the competence of our method.

* 8 pages, accepted by IROS 2023. arXiv admin note: substantial text overlap with arXiv:2306.07647 
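
The observation embedding with self-attention mentioned above might look roughly like the following PyTorch sketch. The feature dimensions, module names, and mean-pooling over neighbors are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class NeighborAttentionEmbedding(nn.Module):
    """Illustrative self-attention embedding over neighbor observations.

    Each neighbor (robot or obstacle) is described by a small feature
    vector; attention aggregates them into a fixed-size embedding that a
    policy network can consume, regardless of how many neighbors exist.
    """

    def __init__(self, feat_dim=6, embed_dim=64, num_heads=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, neighbor_feats):        # (batch, n_neighbors, feat_dim)
        tokens = self.proj(neighbor_feats)    # (batch, n_neighbors, embed_dim)
        attended, _ = self.attn(tokens, tokens, tokens)
        return attended.mean(dim=1)           # (batch, embed_dim)
```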

Multi-Robot Motion Planning: A Learning-Based Artificial Potential Field Solution

Jun 13, 2023
Dengyu Zhang, Guobin Zhu, Qingrui Zhang

Motion planning is a crucial aspect of robot autonomy, as it involves identifying a feasible path to a destination while taking into consideration various constraints, such as input, safety, and performance constraints, without violating system or environment boundaries. This becomes particularly challenging when multiple robots run without communication, which compromises their real-time efficiency, safety, and performance. In this paper, we present a learning-based potential field algorithm that incorporates deep reinforcement learning into an artificial potential field (APF). Specifically, we introduce an observation embedding mechanism that pre-processes dynamic information about the environment and develop a soft wall-following rule to improve trajectory smoothness. Our method, while belonging to reactive planning, implicitly encodes environmental properties. Additionally, our approach scales to any number of robots and has demonstrated superior performance compared to vanilla APF and RL methods in numerical simulations. Finally, experiments are conducted to highlight the effectiveness of the proposed method.

* 6 pages 
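
The abstract describes the soft wall-following rule only at a high level; the sketch below shows one plausible interpretation, in which part of the repulsive APF force is redirected along the obstacle tangent closest to the goal direction. The function name, 2-D setting, and blending weight are assumptions, not the paper's rule.

```python
import numpy as np

def soft_wall_following_force(rep_force, goal_dir, blend=0.7):
    """Illustrative 'soft wall-following' adjustment of a repulsive force.

    Instead of pushing straight away from an obstacle, part of the
    repulsive force is redirected along the obstacle boundary (the
    tangent direction closest to the goal direction), which tends to
    produce smoother, sliding trajectories near walls.
    """
    norm = np.linalg.norm(rep_force)
    if norm < 1e-9:
        return rep_force
    n = rep_force / norm                   # outward normal (2-D case)
    t1 = np.array([-n[1], n[0]])           # two tangent candidates
    t2 = -t1
    tangent = t1 if t1 @ goal_dir >= t2 @ goal_dir else t2
    # Blend: keep some repulsion, redirect the rest along the wall.
    return norm * ((1.0 - blend) * n + blend * tangent)
```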

Policy Transfer via Enhanced Action Space

Dec 07, 2022
Zheng Zhang, Qingrui Zhang, Bo Zhu, Xiaohan Wang, Tianjiang Hu

Though transfer learning is promising for increasing learning efficiency, existing methods still struggle with long-horizon tasks, especially when expert policies are sub-optimal and only partially useful. Hence, a novel algorithm named EASpace (Enhanced Action Space) is proposed in this paper to transfer the knowledge of multiple sub-optimal expert policies. EASpace formulates each expert policy as multiple macro actions with different execution time periods and then integrates all macro actions directly into the primitive action space. Through this formulation, EASpace learns when to execute which expert policy and for how long. An intra-macro-action learning rule is proposed that adjusts the temporal difference target of macro actions to improve data efficiency and alleviate the non-stationarity issue in multi-agent settings. Furthermore, an additional reward proportional to the execution time of macro actions is introduced to encourage environment exploration via macro actions, which is critical for learning long-horizon tasks. Theoretical analysis is presented to show the convergence of the proposed algorithm. The efficiency of the proposed algorithm is illustrated by a grid-based game and a multi-agent pursuit problem. The proposed algorithm is also deployed on real physical systems to justify its effectiveness.

* 14 pages 
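
A minimal sketch of how an EASpace-style action set could be assembled is given below: the discrete action space holds every primitive action plus one macro action per (expert policy, duration) pair. The tuple encoding and the specific durations are illustrative assumptions.

```python
from itertools import product

def build_enhanced_action_space(n_primitive, expert_ids, durations):
    """Illustrative construction of an EASpace-style action set.

    The discrete action space contains every primitive action plus one
    macro action per (expert policy, execution duration) pair; choosing
    a macro action means following that expert for that many steps.
    """
    primitives = [("primitive", a, 1) for a in range(n_primitive)]
    macros = [("macro", e, d) for e, d in product(expert_ids, durations)]
    return primitives + macros

# Example: 5 primitive actions, 2 expert policies, durations of 2/4/8 steps.
actions = build_enhanced_action_space(5, expert_ids=[0, 1], durations=[2, 4, 8])
print(len(actions))  # 5 + 2 * 3 = 11 actions
```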

Multi-robot Cooperative Pursuit via Potential Field-Enhanced Reinforcement Learning

Mar 09, 2022
Zheng Zhang, Xiaohan Wang, Qingrui Zhang, Tianjiang Hu

Coordinating a collective of robots to hunt an evader in a decentralized manner, purely in light of local observations, is challenging yet promising. In this paper, this challenge is addressed by a novel hybrid cooperative pursuit algorithm that combines reinforcement learning with the artificial potential field method. In the proposed algorithm, decentralized deep reinforcement learning is employed to learn cooperative pursuit policies that are adaptive to dynamic environments. The artificial potential field method is integrated into the learning process as predefined rules to improve data efficiency and generalization ability. Numerical simulations show that the proposed hybrid design outperforms the pursuit policies either learned by vanilla reinforcement learning or designed by the potential field method. Furthermore, experiments are conducted by transferring the learned pursuit policies onto real-world mobile robots. Experimental results demonstrate the feasibility and potential of the proposed algorithm for learning multiple cooperative pursuit strategies.

* This paper has been accepted by ICRA 2022 
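
One common way to inject APF rules into the learning loop, sketched below, is to compute an APF pursuit prior and blend it with the learned policy's action; the exact integration used in the paper may differ, and the gains and blending weight here are assumptions.

```python
import numpy as np

def apf_pursuit_action(pursuer_pos, evader_pos, teammate_pos, k_att=1.0, k_rep=0.5):
    """Simple APF prior for pursuit: attracted to the evader, repelled
    from nearby teammates to avoid clustering and collisions."""
    force = k_att * (evader_pos - pursuer_pos)
    for mate in teammate_pos:
        diff = pursuer_pos - mate
        dist = np.linalg.norm(diff) + 1e-6
        force += k_rep * diff / dist**3
    return force

def blended_action(policy_action, apf_action, alpha=0.5):
    """Blend the learned action with the APF prior (illustrative)."""
    return alpha * policy_action + (1.0 - alpha) * apf_action
```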

Reinforcement Learning Compensated Extended Kalman Filter for Attitude Estimation

Jul 27, 2021
Yujie Tang, Liang Hu, Qingrui Zhang, Wei Pan

Inertial measurement units are widely used in different fields to estimate attitude. Many algorithms have been proposed to improve estimation performance. However, most of them still suffer from 1) inaccurate initial estimates, 2) inaccurate initial filter gain, and 3) non-Gaussian process and/or measurement noise. In this paper, we leverage reinforcement learning to compensate the classical extended Kalman filter estimate, i.e., to learn the filter gain from sensor measurements. We also analyse the convergence of the estimation error. The effectiveness of the proposed algorithm is validated on both simulated and real data.

* This paper has been accepted by IROS 2021 
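
The gain-compensation idea can be sketched as an otherwise standard EKF measurement update whose Kalman gain is adjusted by a learned correction term; with a zero correction it reduces to the classical EKF. The function signature and the additive form of the correction are assumptions.

```python
import numpy as np

def compensated_ekf_update(x_pred, P_pred, z, h, H, R, gain_correction):
    """Illustrative EKF measurement update with a learned gain correction.

    `gain_correction` stands in for the output of an RL agent that has
    learned to adjust the filter gain from sensor data; with a zero
    correction this reduces to a standard EKF update.
    """
    y = z - h(x_pred)                      # innovation
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # nominal Kalman gain
    K = K + gain_correction                # learned compensation
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_new, P_new
```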

Lyapunov-Based Reinforcement Learning for Decentralized Multi-Agent Control

Sep 20, 2020
Qingrui Zhang, Hao Dong, Wei Pan

Decentralized multi-agent control has broad applications, ranging from multi-robot cooperation to distributed sensor networks. In decentralized multi-agent control, systems are complex with unknown or highly uncertain dynamics, where traditional model-based control methods can hardly be applied. Compared with model-based control in control theory, deep reinforcement learning (DRL) is promising for learning the controller/policy from data without knowing the system dynamics. However, directly applying DRL to decentralized multi-agent control is challenging, as interactions among agents make the learning environment non-stationary. More importantly, existing multi-agent reinforcement learning (MARL) algorithms cannot ensure the closed-loop stability of a multi-agent system from a control-theoretic perspective, so the learned control policies are likely to generate abnormal or dangerous behaviors in real applications. Hence, without a stability guarantee, applying existing MARL algorithms to real multi-agent systems, e.g., UAVs, robots, and power systems, is of great concern. In this paper, we propose a new MARL algorithm for decentralized multi-agent control with a stability guarantee. The new MARL algorithm, termed multi-agent soft actor-critic (MASAC), is proposed under the well-known framework of "centralized training with decentralized execution". Closed-loop stability is guaranteed by introducing a stability constraint during the policy improvement in our MASAC algorithm. The stability constraint is designed based on Lyapunov's method in control theory. A multi-agent navigation example is presented to demonstrate the effectiveness of the proposed MASAC algorithm.

* Accepted to The 2nd International Conference on Distributed Artificial Intelligence 
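
A rough sketch of a Lyapunov-constrained policy-improvement step is shown below: the SAC-style actor objective is augmented with a penalty whenever a learned Lyapunov candidate fails to decrease along a transition. The Lagrangian form, fixed multiplier, and margin are assumptions rather than the paper's exact constraint.

```python
import torch

def lyapunov_constrained_actor_loss(log_prob, q_value, lyap_next, lyap_now,
                                    alpha=0.2, lam=1.0, decrease_margin=0.0):
    """Illustrative SAC-style actor loss with a Lyapunov decrease penalty.

    The usual soft actor-critic objective (entropy-regularized Q) is
    augmented with a Lagrangian term that penalizes transitions for
    which a learned Lyapunov candidate fails to decrease.
    """
    sac_term = (alpha * log_prob - q_value).mean()
    violation = torch.relu(lyap_next - lyap_now + decrease_margin).mean()
    return sac_term + lam * violation
```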

Model-Reference Reinforcement Learning for Collision-Free Tracking Control of Autonomous Surface Vehicles

Aug 17, 2020
Qingrui Zhang, Wei Pan, Vasso Reppa

This paper presents a novel model-reference reinforcement learning algorithm for the intelligent tracking control of uncertain autonomous surface vehicles with collision avoidance. The proposed control algorithm combines a conventional control method with reinforcement learning to enhance control accuracy and intelligence. In the proposed control design, a nominal system is considered for the design of a baseline tracking controller using a conventional control approach. The nominal system also defines the desired behaviour of uncertain autonomous surface vehicles in an obstacle-free environment. Thanks to reinforcement learning, the overall tracking controller is capable of compensating for model uncertainties and achieving collision avoidance at the same time in environments with obstacles. In comparison to traditional deep reinforcement learning methods, our proposed learning-based control can provide stability guarantees and better sample efficiency. We demonstrate the performance of the new algorithm using an example of autonomous surface vehicles.

* Extension of arXiv:2003.13839 
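
The controller described above, a baseline tracking law plus a learned compensation term, can be sketched as follows. The PD-style baseline, the state layout, and the rl_policy interface are illustrative assumptions.

```python
import numpy as np

def tracking_control(x, x_ref, rl_policy, K_p=2.0, K_d=1.0):
    """Illustrative model-reference control: baseline tracking law plus
    a learned correction from the RL policy.

    `x` and `x_ref` stack position and velocity; the RL term is assumed
    to compensate for model uncertainty and steer around obstacles.
    """
    pos_err = x_ref[:2] - x[:2]
    vel_err = x_ref[2:] - x[2:]
    u_baseline = K_p * pos_err + K_d * vel_err     # conventional PD-style law
    u_learned = rl_policy(np.concatenate([x, x_ref]))
    return u_baseline + u_learned
```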

Model-Reference Reinforcement Learning Control of Autonomous Surface Vehicles with Uncertainties

Mar 30, 2020
Qingrui Zhang, Wei Pan, Vasso Reppa

This paper presents a novel model-reference reinforcement learning control method for uncertain autonomous surface vehicles. The proposed control combines a conventional control method with deep reinforcement learning. The conventional control component ensures that the learning-based control law provides closed-loop stability for the overall system and potentially increases the sample efficiency of the deep reinforcement learning. The reinforcement learning component directly learns a control law to compensate for modeling uncertainties. In the proposed control, a nominal system is employed for the design of a baseline control law using a conventional control approach. The nominal system also defines the desired performance for the uncertain autonomous vehicles to follow. In comparison with traditional deep reinforcement learning methods, our proposed learning-based control can provide stability guarantees and better sample efficiency. We demonstrate the performance of the new algorithm via extensive simulation results.

* 9 pages, 10 figures 
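
A minimal sketch of the nominal (reference) model idea is given below: a simple first-order reference model generates the desired behavior that the RL-compensated controller then tracks. The actual surface-vehicle dynamics are far richer; the scalar model and gains here are assumptions.

```python
import numpy as np

def reference_model_step(x_ref, r, dt=0.02, a=1.5):
    """Illustrative first-order reference model x_ref_dot = -a * (x_ref - r).

    The nominal model defines the desired closed-loop behavior; the
    RL-compensated controller then drives the real, uncertain vehicle's
    state toward this reference trajectory.
    """
    return x_ref + dt * (-a * (x_ref - r))

# Example: the reference state smoothly converges to the command r = 1.0.
x_ref = 0.0
for _ in range(200):
    x_ref = reference_model_step(x_ref, r=1.0)
print(round(x_ref, 3))
```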

Robust Cooperative Formation Control of Fixed-Wing Unmanned Aerial Vehicles

May 06, 2019
Qingrui Zhang, Hugh H. T. Liu

Robust cooperative formation control is investigated in this paper for fixed-wing unmanned aerial vehicles (UAVs) flying in close formation to save energy. A novel cooperative control method is developed. The concept of a virtual structure is employed to resolve the difficulty of designing virtual leaders for a large number of UAVs in formation flight. To improve transient performance, desired trajectories are passed through a group of cooperative filters to generate smooth reference signals, namely the states of the virtual leaders. Model uncertainties due to aerodynamic couplings among UAVs are estimated and compensated using uncertainty and disturbance observers. The entire design therefore contains three major components: cooperative filters for motion planning, baseline cooperative control, and uncertainty and disturbance observation. The proposed formation controller guarantees at least ultimately bounded formation tracking performance, and asymptotic formation tracking can be obtained if certain conditions are satisfied. The major contributions of this paper lie in two aspects: 1) the difficulty of designing virtual leaders is resolved via the virtual structure concept; 2) a robust cooperative controller is proposed for close formation flight of a large number of UAVs subject to mutual aerodynamic couplings. The efficiency of the proposed design is demonstrated using numerical simulations of five UAVs in close formation flight.
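
The uncertainty and disturbance observation component can be sketched, under simplifying assumptions, as a first-order observer that drives a lumped disturbance estimate toward the residual between the measured and nominal state derivatives. The gains, discrete-time form, and function name are illustrative only.

```python
def disturbance_observer_step(d_hat, x_dot_meas, x_dot_nominal, dt=0.01, k_obs=20.0):
    """Illustrative first-order uncertainty/disturbance observer.

    The lumped disturbance estimate is driven toward the discrepancy
    between the measured state derivative and the nominal model's
    prediction; the estimate is then subtracted in the control law.
    """
    residual = x_dot_meas - (x_dot_nominal + d_hat)
    return d_hat + dt * k_obs * residual
```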
