Tianyu Shi

Improving the generalizability and robustness of large-scale traffic signal control

Jun 08, 2023
Tianyu Shi, Francois-Xavier Devailly, Denis Larocque, Laurent Charlin

Deep reinforcement learning (RL) approaches have been proposed to control traffic signals. In this work, we study the robustness of such methods along two axes. First, sensor failures and GPS occlusions create missing-data challenges, and we show that recent methods remain brittle in the face of these missing data. Second, we provide a more systematic study of the generalization ability of RL methods to new networks with different traffic regimes. Again, we identify the limitations of recent approaches. We then propose combining distributional and vanilla reinforcement learning through a policy ensemble. Building upon a state-of-the-art model that uses a decentralized approach for large-scale traffic signal control with graph convolutional networks (GCNs), we first learn models using a distributional reinforcement learning (DisRL) approach. In particular, we use implicit quantile networks (IQN) to model the state-action return distribution with quantile regression. For traffic signal control problems, an ensemble of standard RL and DisRL yields superior performance across different scenarios, including different levels of missing sensor data and traffic flow patterns. Furthermore, the learning scheme of the resulting model can improve zero-shot transferability to different road network structures, including both synthetic networks and real-world networks (e.g., Luxembourg, Manhattan). We conduct extensive experiments comparing our approach to multi-agent reinforcement learning and traditional transportation approaches. The results show that the proposed method improves robustness and generalizability in the face of missing data, varying road networks, and traffic flows.
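
The core of the approach, as described above, is combining a vanilla value estimate with a distributional (IQN) estimate. The sketch below illustrates that combination in PyTorch; the layer sizes, the cosine quantile embedding, and the mixing weight are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class IQNHead(nn.Module):
        """Implicit quantile network head: embeds sampled quantile levels
        tau with a cosine basis and mixes them into the state features
        (which could come from a GCN encoder)."""
        def __init__(self, state_dim, n_actions, n_cos=64):
            super().__init__()
            self.cos_embed = nn.Linear(n_cos, state_dim)
            self.out = nn.Linear(state_dim, n_actions)
            self.n_cos = n_cos

        def forward(self, phi, n_tau=32):
            # phi: (batch, state_dim) state features
            tau = torch.rand(phi.size(0), n_tau, 1, device=phi.device)
            i = torch.arange(1, self.n_cos + 1, device=phi.device).float()
            tau_embed = torch.relu(self.cos_embed(torch.cos(tau * i * torch.pi)))
            z = self.out(phi.unsqueeze(1) * tau_embed)  # (batch, n_tau, n_actions)
            return z

    def ensemble_q(phi, dqn_head, iqn_head, weight=0.5):
        """Mix a vanilla Q-estimate with the quantile mean from IQN;
        `weight` is a hypothetical mixing coefficient."""
        q_vanilla = dqn_head(phi)            # (batch, n_actions)
        q_dist = iqn_head(phi).mean(dim=1)   # expectation over quantiles
        return weight * q_vanilla + (1 - weight) * q_dist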

Fast Rule-Based Decoding: Revisiting Syntactic Rules in Neural Constituency Parsing

Dec 16, 2022
Tianyu Shi, Zhicheng Wang, Liyin Xiao, Cong Liu

Most recent studies on neural constituency parsing focus on encoder structures, while few developments are devoted to decoders. Previous research has demonstrated that probabilistic statistical methods based on syntactic rules are particularly effective in constituency parsing, yet prior work does not use syntactic rules when training neural models, probably because of their enormous computational requirements. In this paper, we first implement a fast CKY decoding procedure harnessing GPU acceleration, based on which we further derive a syntactic rule-based (rule-constrained) CKY decoding procedure. In experiments, our method obtains F1 scores of 95.89 on PTB and 92.52 on CTB, a significant improvement over previous approaches. In addition, our parser achieves strong and competitive cross-domain performance in zero-shot settings.
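
As a rough illustration of why CKY decoding parallelizes well on a GPU, the sketch below fills a span-score chart one span length at a time; all cells of the same length are independent. The tensor layout and the absence of labels and grammar constraints are simplifying assumptions, not the paper's implementation (a rule-constrained variant would mask illegal splits before taking the max).

    import torch

    def cky_best_score(span_scores):
        """span_scores: (n+1, n+1) tensor; span_scores[i, j] is the model
        score of span (i, j), j exclusive. Returns the best tree score."""
        n = span_scores.size(0) - 1
        chart = torch.full((n + 1, n + 1), float("-inf"))
        for i in range(n):                       # length-1 spans
            chart[i, i + 1] = span_scores[i, i + 1]
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length
                k = torch.arange(i + 1, j)
                # every split point is scored in one vectorized op; on a
                # GPU all spans of the same length can also be batched
                chart[i, j] = span_scores[i, j] + (chart[i, k] + chart[k, j]).max()
        return chart[0, n]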

Joint Chinese Word Segmentation and Span-based Constituency Parsing

Nov 03, 2022
Zhicheng Wang, Tianyu Shi, Cong Liu

Span-based decoding is an important direction in constituency parsing. For Chinese sentences, however, linguistic characteristics make it necessary to use a separate model for word segmentation first, which introduces a series of uncertainties and generally leads to errors in the subsequent computation of the constituency tree. This work proposes a method for joint Chinese word segmentation and span-based constituency parsing that adds extra labels to individual Chinese characters on the parse trees. In experiments, the proposed algorithm outperforms recent models for joint segmentation and constituency parsing on CTB 5.1.
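
The sketch below illustrates the general idea of absorbing word segmentation into the parse tree by relabeling: each multi-character word becomes a small subtree over its characters, with a word-internal marker so the segmentation can be read back off the tree. The tuple tree format and the "#" marker are hypothetical, not the paper's actual labeling scheme.

    def word_to_char_tree(tree):
        """tree: (label, children) where children is a list of subtrees,
        or (pos_tag, word) at the preterminals."""
        label, child = tree
        if isinstance(child, str):               # preterminal over a word
            if len(child) == 1:
                return (label, child)
            # one leaf per character, grouped under a word-internal label
            return (label + "#", [(label, ch) for ch in child])
        return (label, [word_to_char_tree(t) for t in child])

    # an NP over two two-character words -> character-level tree
    print(word_to_char_tree(("NP", [("NN", "中国"), ("NN", "经济")])))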

Order-sensitive Neural Constituency Parsing

Nov 01, 2022
Zhicheng Wang, Tianyu Shi, Liyin Xiao, Cong Liu

We propose a novel algorithm that improves on the previous neural span-based CKY decoder for constituency parsing. In contrast to traditional span-based decoding, where spans are combined based only on the sum of their scores, we introduce an order-sensitive strategy in which the span-combination scores are derived from an order-sensitive basis. Our decoder can be regarded as a generalization of existing span-based decoders with a finer-grained scoring scheme for combining lower-level spans into higher-level spans: we emphasize the order of the lower-level spans and use order-sensitive span scores as well as order-sensitive combination grammar-rule scores to enhance prediction accuracy. We implement the proposed decoding strategy harnessing GPU parallelism and achieve a decoding speed on par with state-of-the-art span-based parsers. Using the previous state-of-the-art model without additional data as our baseline, we outperform it, improving the F1 score on the Penn Treebank by 0.26% and on the Chinese Treebank by 0.35%.

* Paper presented at the 34th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)
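
The contrast drawn in the abstract above is between combining two child spans by the plain sum of their scores and scoring the combination itself in an order-sensitive way. A minimal sketch, where `combine` stands in for a hypothetical learned function that is deliberately not symmetric in its two arguments:

    def best_split_traditional(chart, i, j):
        """Classic span-based CKY: children contribute only their sum.
        chart maps (i, j) to the best score of span (i, j); requires j > i + 1."""
        return max(chart[i, k] + chart[k, j] for k in range(i + 1, j))

    def best_split_order_sensitive(chart, combine, i, j):
        """Order-sensitive variant: combine((i, k), (k, j)) scores the
        specific (left, right) pairing, so swapping children changes it."""
        return max(chart[i, k] + chart[k, j] + combine((i, k), (k, j))
                   for k in range(i + 1, j))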

WILD-SCAV: Benchmarking FPS Gaming AI on Unity3D-based Environments

Oct 14, 2022
Xi Chen, Tianyu Shi, Qingpeng Zhao, Yuchen Sun, Yunfei Gao, Xiangjun Wang

Recent advances in deep reinforcement learning (RL) have demonstrated complex decision-making capabilities in simulation environments such as the Arcade Learning Environment, MuJoCo, and ViZDoom. However, these environments hardly extend to more complicated problems, mainly because they lack complexity and variation, nor do they offer an open world in which to pursue long-term exploration research. To learn realistic task-solving capabilities, we need an environment with greater diversity and complexity. To bridge the gap, we developed WILD-SCAV, a powerful and extensible environment based on a 3D open-world FPS (first-person shooter) game. It provides realistic 3D environments of variable complexity, various tasks, and multiple modes of interaction, where agents can learn to perceive 3D environments, navigate and plan, and compete and cooperate in a human-like manner. WILD-SCAV also supports different levels of complexity, such as configurable maps with different terrains, building structures and distributions, and multi-agent settings with cooperative and competitive tasks. Experimental results on configurable-complexity, multi-tasking, and multi-agent scenarios demonstrate the effectiveness of WILD-SCAV for benchmarking various RL algorithms, as well as its potential to give rise to intelligent agents with generalized task-solving abilities. Our open-sourced code is available at https://github.com/inspirai/wilderness-scavenger.
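
A benchmark like this is typically driven through a Gym-style interaction loop. The sketch below is purely illustrative: the environment and policy interfaces are hypothetical stand-ins, not the actual WILD-SCAV API (see the linked repository for the real interface).

    import random

    class DummyEnv:
        """Hypothetical stand-in with a Gym-style reset/step interface."""
        def __init__(self, episode_len=100, n_actions=8):
            self.episode_len, self.n_actions = episode_len, n_actions
        def reset(self):
            self.t = 0
            return [0.0]                          # placeholder observation
        def step(self, action):
            self.t += 1
            done = self.t >= self.episode_len
            return [0.0], random.random(), done, {}

    def evaluate(env, policy, episodes=10):
        """Mean episode return of `policy(obs) -> action` over rollouts."""
        returns = []
        for _ in range(episodes):
            obs, done, total = env.reset(), False, 0.0
            while not done:
                obs, reward, done, _ = env.step(policy(obs))
                total += reward
            returns.append(total)
        return sum(returns) / len(returns)

    env = DummyEnv()
    print(evaluate(env, lambda obs: random.randrange(env.n_actions)))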

Towards Modern Card Games with Large-Scale Action Spaces Through Action Representation

Jun 25, 2022
Zhiyuan Yao, Tianyu Shi, Site Li, Yiting Xie, Yuanyuan Qin, Xiongjie Xie, Huan Lu, Yan Zhang

Axie Infinity is a complicated card game with a huge action space, which makes it difficult to solve with generic reinforcement learning (RL) algorithms. We propose a hybrid RL framework that learns action representations and game strategies. To avoid evaluating every action in the large feasible action set, our method evaluates actions in a fixed-size set determined using action representations. We compare our method with two baseline methods in terms of sample efficiency and the winning rates of the trained models. We show empirically that our method achieves the best overall winning rate and the best sample efficiency among the three methods.

* Accepted as an IEEE CoG 2022 proceedings paper
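
A common way to avoid evaluating every action in a huge feasible set, consistent with the description above, is to embed actions and score only a fixed-size neighborhood of a policy-proposed point in that space. A minimal sketch under assumed shapes; the proto-action mechanism is an illustrative choice, not necessarily the paper's exact architecture.

    import torch

    def candidate_actions(proto, action_embeds, feasible_idx, k=8):
        """proto: (d,) 'proto-action' produced by the policy network.
        action_embeds: (n_actions, d) learned action representations.
        feasible_idx: 1-D LongTensor of currently legal action indices.
        Returns the k feasible actions nearest to the proto-action, so
        only a fixed-size set needs to be evaluated."""
        feasible = action_embeds[feasible_idx]                  # (m, d)
        dists = torch.cdist(proto.unsqueeze(0), feasible)[0]    # (m,)
        nearest = dists.topk(min(k, feasible.size(0)), largest=False).indices
        return feasible_idx[nearest]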

Bilateral Deep Reinforcement Learning Approach for Better-than-human Car Following Model

Mar 03, 2022
Tianyu Shi, Yifei Ai, Omar ElSamadisy, Baher Abdulhai

In the coming years and decades, autonomous vehicles (AVs) will become increasingly prevalent, offering new opportunities for safer and more convenient travel and potentially smarter traffic control methods exploiting automation and connectivity. Car following is a prime function in autonomous driving, and car following based on reinforcement learning has received attention in recent years, with the goal of learning and achieving performance levels comparable to humans. However, most existing RL methods model car following as a unilateral problem, sensing only the vehicle ahead. Recent work by Wang and Horn [16] has shown that bilateral car following, which considers both the vehicle ahead and the vehicle behind, exhibits better system stability. In this paper, we hypothesize that bilateral car following can be learned with RL alongside goals such as efficiency maximization, jerk minimization, and safety, yielding a learned model that outperforms human driving. We introduce a deep reinforcement learning (DRL) framework for car-following control that integrates bilateral information into both the state and the reward function, based on the bilateral control model (BCM). Furthermore, we use a decentralized multi-agent reinforcement learning framework to generate the control action for each agent. Our simulation results demonstrate that the learned policy outperforms the human driving policy in terms of (a) inter-vehicle headways, (b) average speed, (c) jerk, (d) time to collision (TTC), and (e) string stability.
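
A minimal sketch of what a bilateral state and reward could look like: the agent observes gaps and relative speeds to both the vehicle ahead and the vehicle behind, and the reward balances the two gaps (the BCM idea) while rewarding speed and penalizing jerk. The specific features and weights are illustrative assumptions, not the paper's exact design.

    def bilateral_state(ego, front, rear):
        """Each argument is a dict with 'pos' and 'speed' (SI units)."""
        return [
            front["pos"] - ego["pos"],      # front gap
            front["speed"] - ego["speed"],  # front relative speed
            ego["pos"] - rear["pos"],       # rear gap
            ego["speed"] - rear["speed"],   # rear relative speed
            ego["speed"],
        ]

    def bilateral_reward(front_gap, rear_gap, speed, jerk,
                         w_balance=1.0, w_speed=0.1, w_jerk=0.01):
        # BCM-style term: keep the ego vehicle centered between neighbors
        balance = -abs(front_gap - rear_gap)
        return w_balance * balance + w_speed * speed - w_jerk * abs(jerk)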

Offline Reinforcement Learning for Autonomous Driving with Safety and Exploration Enhancement

Nov 02, 2021
Tianyu Shi, Dong Chen, Kaian Chen, Zhaojian Li

Reinforcement learning (RL) is a powerful data-driven control method that has been widely explored in autonomous driving tasks. However, conventional RL approaches learn control policies through trial-and-error interactions with the environment and may therefore cause disastrous consequences, such as collisions, when tested in real-world traffic. Offline RL has recently emerged as a promising framework for learning effective policies from previously collected, static datasets without requiring active interaction, making it especially appealing for autonomous driving applications. Although promising, existing offline RL algorithms such as Batch-Constrained deep Q-learning (BCQ) generally lead to rather conservative policies with limited exploration efficiency. To address these issues, this paper presents an enhanced BCQ algorithm that employs a learnable parameter-noise scheme in the perturbation model to increase the diversity of observed actions. In addition, a Lyapunov-based safety-enhancement strategy is incorporated to constrain the explorable state space within a safe region. Experimental results in highway and parking traffic scenarios show that our approach outperforms both a conventional RL method and state-of-the-art offline RL algorithms.

* Machine Learning for Autonomous Driving Workshop at NeurIPS 2021
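
A minimal sketch (an assumption-laden illustration, not the paper's code) of a BCQ-style perturbation model whose output layer carries learnable parameter noise: each weight gets a trainable noise scale, resampled per forward pass, so the diversity of perturbed actions is itself learned.

    import torch
    import torch.nn as nn

    class NoisyLinear(nn.Module):
        """Linear layer with learnable per-weight noise scales."""
        def __init__(self, in_dim, out_dim, sigma0=0.017):
            super().__init__()
            self.mu_w = nn.Parameter(torch.empty(out_dim, in_dim).uniform_(-0.1, 0.1))
            self.sigma_w = nn.Parameter(torch.full((out_dim, in_dim), sigma0))
            self.mu_b = nn.Parameter(torch.zeros(out_dim))
            self.sigma_b = nn.Parameter(torch.full((out_dim,), sigma0))

        def forward(self, x):
            w = self.mu_w + self.sigma_w * torch.randn_like(self.sigma_w)
            b = self.mu_b + self.sigma_b * torch.randn_like(self.sigma_b)
            return x @ w.t() + b

    class Perturbation(nn.Module):
        """Outputs a bounded correction added to a candidate action."""
        def __init__(self, state_dim, action_dim, max_perturb=0.05):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                NoisyLinear(64, action_dim),
            )
            self.max_perturb = max_perturb

        def forward(self, state, action):
            xi = torch.tanh(self.net(torch.cat([state, action], dim=-1)))
            return action + self.max_perturb * xi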

Towards Efficient Connected and Automated Driving System via Multi-agent Graph Reinforcement Learning

Jul 08, 2020
Tianyu Shi, Jiawei Wang, Yuankai Wu, Lijun Sun

Connected and automated vehicles (CAVs) have attracted increasing attention in recent years. Their fast actuation time gives them the potential to improve the efficiency and safety of the whole transportation system. Due to technical challenges, only a proportion of vehicles will be equipped with automation while other vehicles remain without it. Instead of learning a reliable behavior for the ego automated vehicle alone, we focus on improving the outcomes of the total transportation system by allowing each automated vehicle to learn to cooperate with the others and to regulate human-driven traffic flow. One state-of-the-art method is to use reinforcement learning to learn an intelligent decision-making policy. However, a direct reinforcement learning framework cannot improve the performance of the whole system. In this article, we demonstrate that formulating the problem in a multi-agent setting with a shared policy achieves better system performance than a non-shared policy in a single-agent setting. Furthermore, we find that applying an attention mechanism to interaction features can capture the interplay between agents and boost cooperation. To the best of our knowledge, while previous automated-driving studies have mainly focused on enhancing an individual's driving performance, this work serves as a starting point for research on system-level multi-agent cooperation using graph information sharing. We conduct extensive experiments in car-following and unsignalized-intersection settings. The results demonstrate that CAVs controlled by our method achieve the best performance against several state-of-the-art baselines.
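
A minimal sketch (assumed shapes, not the paper's code) of attention over neighboring CAVs: each agent attends to its graph neighbors' interaction features so the shared policy can weight the interactions that matter.

    import torch
    import torch.nn as nn

    class NeighborAttention(nn.Module):
        def __init__(self, feat_dim, hid_dim=32):
            super().__init__()
            self.q = nn.Linear(feat_dim, hid_dim)
            self.k = nn.Linear(feat_dim, hid_dim)
            self.v = nn.Linear(feat_dim, hid_dim)

        def forward(self, ego, neighbors):
            # ego: (feat_dim,); neighbors: (n_neighbors, feat_dim)
            query = self.q(ego)                                    # (hid_dim,)
            scores = self.k(neighbors) @ query / query.size(0) ** 0.5
            alpha = torch.softmax(scores, dim=0)                   # which neighbors matter
            return alpha @ self.v(neighbors)                       # aggregated context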
