Yuying Zhu

DenseLight: Efficient Control for Large-scale Traffic Signals with Dense Feedback

Jun 13, 2023
Junfan Lin, Yuying Zhu, Lingbo Liu, Yang Liu, Guanbin Li, Liang Lin

Traffic Signal Control (TSC) aims to reduce the average travel time of vehicles in a road network, which in turn enhances fuel efficiency, air quality, and road safety, benefiting society as a whole. Due to the complexity of long-horizon control and coordination, most prior TSC methods leverage deep reinforcement learning (RL) to search for a control policy and have achieved great success. However, TSC still faces two significant challenges. 1) The travel time of a vehicle is delayed feedback on the effectiveness of the TSC policy at each intersection, since it is only obtained after the vehicle has left the road network. Although several heuristic reward functions have been proposed as substitutes for travel time, they are usually biased and may not guide the policy to improve in the correct direction. 2) The traffic condition of each intersection is influenced by non-local intersections, since vehicles traverse multiple intersections over time. Therefore, the TSC agent must leverage both local observations and non-local traffic conditions to comprehensively predict the long-horizon traffic conditions of each intersection. To address these challenges, we propose DenseLight, a novel RL-based TSC method that employs an unbiased reward function to provide dense feedback on policy effectiveness and a non-local enhanced TSC agent to better predict future traffic conditions for more precise control. Extensive experiments and ablation studies demonstrate that DenseLight consistently outperforms advanced baselines on various road networks with diverse traffic flows. The code is available at https://github.com/junfanlin/DenseLight.

* This work was accepted by IJCAI 2023
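
The central idea is to replace sparse, end-of-trip travel time with a dense per-step signal that decomposes it without bias. Below is a minimal sketch of one such decomposition, per-step vehicle delay relative to free-flow speed; the exact reward used by DenseLight may differ, so treat the function name and formulation as illustrative and see the linked repository for the real definition.

```python
# Hypothetical per-step dense reward for traffic signal control.
# DenseLight's actual reward differs in detail; see the paper and
# https://github.com/junfanlin/DenseLight for the real definition.
def dense_delay_reward(speeds, free_flow_speed):
    """Negative delay accumulated by all vehicles during one step.

    speeds: iterable of current vehicle speeds (m/s) in the network.
    free_flow_speed: speed a vehicle would travel if unimpeded (m/s).

    A vehicle moving at v "loses" (1 - v / v_free) of a time step,
    so summing this term over all steps and vehicles approximates
    total travel-time delay -- giving the policy feedback at every
    step instead of only when a vehicle exits the network.
    """
    return -sum(1.0 - min(v, free_flow_speed) / free_flow_speed
                for v in speeds)

# Example: three vehicles, free-flow speed 15 m/s.
print(dense_delay_reward([0.0, 7.5, 15.0], 15.0))  # -1.5
```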

Online Metro Origin-Destination Prediction via Heterogeneous Information Aggregation

Aug 02, 2021
Lingbo Liu, Yuying Zhu, Guanbin Li, Ziyi Wu, Lei Bai, Liang Lin

Metro origin-destination prediction is a crucial yet challenging time-series analysis task in intelligent transportation systems, which aims to accurately forecast two specific types of cross-station ridership: Origin-Destination (OD) ridership and Destination-Origin (DO) ridership. However, complete OD matrices of previous time intervals cannot be obtained immediately in online metro systems, and conventional methods use only limited information to forecast future OD and DO ridership separately. In this work, we propose a novel neural network module termed Heterogeneous Information Aggregation Machine (HIAM), which fully exploits heterogeneous information from historical data (e.g., incomplete OD matrices, unfinished order vectors, and DO matrices) to jointly learn the evolutionary patterns of OD and DO ridership. Specifically, an OD modeling branch explicitly estimates the potential destinations of unfinished orders to complement the information of incomplete OD matrices, while a DO modeling branch takes DO matrices as input to capture the spatial-temporal distribution of DO ridership. Moreover, a Dual Information Transformer is introduced to propagate mutual information between OD features and DO features, modeling the OD-DO causality and correlation. Based on the proposed HIAM, we develop a unified Seq2Seq network to forecast future OD and DO ridership simultaneously. Extensive experiments on two large-scale benchmarks demonstrate the effectiveness of our method for online metro origin-destination prediction.
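
As a rough illustration of the two-branch design, the PyTorch sketch below encodes OD and DO sequences separately, exchanges information between the two feature streams with a pair of cross-attention blocks (a simplified stand-in for the Dual Information Transformer), and predicts the next OD and DO matrices. All module names, sizes, and the single-step decoder are assumptions made for brevity, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class DualInfoExchange(nn.Module):
    """Simplified stand-in for the Dual Information Transformer:
    two cross-attention blocks exchange information between the
    OD and DO feature streams (one per direction)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.od_from_do = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.do_from_od = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, od, do):
        od2, _ = self.od_from_do(od, do, do)   # OD queries attend to DO
        do2, _ = self.do_from_od(do, od, od)   # DO queries attend to OD
        return od + od2, do + do2              # residual fusion

class ToyHIAM(nn.Module):
    """Minimal two-branch sketch: each branch encodes its own
    ridership sequence, features are exchanged, and linear heads
    predict the next OD and DO matrices (flattened)."""
    def __init__(self, n_stations, dim=64):
        super().__init__()
        d_in = n_stations * n_stations
        self.od_enc = nn.GRU(d_in, dim, batch_first=True)
        self.do_enc = nn.GRU(d_in, dim, batch_first=True)
        self.exchange = DualInfoExchange(dim)
        self.od_head = nn.Linear(dim, d_in)
        self.do_head = nn.Linear(dim, d_in)

    def forward(self, od_seq, do_seq):
        od_h, _ = self.od_enc(od_seq)          # (B, T, dim)
        do_h, _ = self.do_enc(do_seq)
        od_h, do_h = self.exchange(od_h, do_h)
        return self.od_head(od_h[:, -1]), self.do_head(do_h[:, -1])

# Toy usage: batch of 2, 4 past intervals, 6 stations (6*6 = 36 entries).
model = ToyHIAM(n_stations=6)
od_next, do_next = model(torch.rand(2, 4, 36), torch.rand(2, 4, 36))
```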

MTGAT: Multimodal Temporal Graph Attention Networks for Unaligned Human Multimodal Language Sequences

Oct 22, 2020
Jianing Yang, Yongxin Wang, Ruitao Yi, Yuying Zhu, Azaan Rehman, Amir Zadeh, Soujanya Poria, Louis-Philippe Morency

Human communication is multimodal in nature: opinions and emotions are expressed through multiple modalities, i.e., language, voice, and facial expressions. Data in this domain exhibits complex multi-relational and temporal interactions, and learning from it is a fundamentally challenging research problem. In this paper, we propose Multimodal Temporal Graph Attention Networks (MTGAT), an interpretable graph-based neural model that provides a suitable framework for analyzing this type of multimodal sequential data. We first introduce a procedure to convert unaligned multimodal sequence data into a graph with heterogeneous nodes and edges that captures the rich interactions between modalities through time. Then, a novel graph operation, called Multimodal Temporal Graph Attention, along with a dynamic pruning and read-out technique, is designed to efficiently process this multimodal temporal graph. By learning to focus only on the important interactions within the graph, MTGAT achieves state-of-the-art performance on multimodal sentiment analysis and emotion recognition benchmarks, including IEMOCAP and CMU-MOSI, while requiring significantly less computation.
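
To make the graph construction and attention concrete, here is a toy PyTorch sketch: each node is one timestep from one modality, a learned modality-type embedding distinguishes the heterogeneous nodes, and masked scaled dot-product attention restricts message passing to graph edges. The class name, the fully connected toy adjacency, and the omission of edge types, dynamic pruning, and read-out are all simplifications, not the paper's exact operation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMultimodalGraphAttention(nn.Module):
    """Sketch of the core idea (not the paper's exact operation):
    nodes from all modalities and timesteps attend to their graph
    neighbours, with a learned embedding marking each node's
    modality so heterogeneous nodes are distinguishable."""
    def __init__(self, dim, n_modalities=3):
        super().__init__()
        self.type_emb = nn.Embedding(n_modalities, dim)
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x, modality_ids, adj):
        # x: (N, dim) node features; adj: (N, N) 0/1 adjacency mask.
        h = x + self.type_emb(modality_ids)
        scores = self.q(h) @ self.k(h).T / h.shape[-1] ** 0.5
        scores = scores.masked_fill(adj == 0, float('-inf'))
        return F.softmax(scores, dim=-1) @ self.v(h)

# Unaligned sequences: 4 language, 6 audio, 5 vision nodes, dim 32.
feats = torch.rand(15, 32)
mods = torch.tensor([0] * 4 + [1] * 6 + [2] * 5)
adj = torch.ones(15, 15)          # fully connected toy graph
layer = ToyMultimodalGraphAttention(32)
out = layer(feats, mods, adj)     # (15, 32) updated node features
```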

What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets

Jul 07, 2020
Jianing Yang, Yuying Zhu, Yongxin Wang, Ruitao Yi, Amir Zadeh, Louis-Philippe Morency

Question answering biases in video QA datasets can mislead multimodal models into overfitting to QA artifacts, jeopardizing their ability to generalize. Understanding how strong these QA biases are and where they come from helps the community measure progress more accurately and gives researchers insights for debugging their models. In this paper, we analyze QA biases in popular video question answering datasets and discover that pretrained language models can answer 37-48% of questions correctly without using any multimodal context, far exceeding the 20% random-guess baseline for 5-choose-1 multiple-choice questions. Our ablation study shows that biases can come from the annotators and the type of questions: questions written by annotators seen during training are easier for the model to predict, and reasoning or abstract questions incur more bias than factual, direct questions. We also show empirically that annotator-non-overlapping train-test splits can reduce QA biases in video QA datasets.
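
The reported mitigation, annotator-non-overlapping splits, is straightforward to implement. A minimal sketch follows, assuming each example is a dict with an 'annotator' field; the schema is hypothetical and does not reflect any particular dataset's actual format.

```python
import random

def annotator_disjoint_split(examples, test_frac=0.2, seed=0):
    """Split so that no annotator appears in both train and test,
    which the paper shows reduces QA bias.

    examples: list of dicts, each with an 'annotator' key
              (hypothetical schema for illustration).
    """
    annotators = sorted({ex['annotator'] for ex in examples})
    random.Random(seed).shuffle(annotators)
    n_test = max(1, int(len(annotators) * test_frac))
    test_ann = set(annotators[:n_test])
    train = [ex for ex in examples if ex['annotator'] not in test_ann]
    test = [ex for ex in examples if ex['annotator'] in test_ann]
    return train, test
```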

CAN-NER: Convolutional Attention Network for Chinese Named Entity Recognition

Apr 30, 2019
Yuying Zhu, Guoxin Wang, Börje F. Karlsson

Named entity recognition (NER) in Chinese is essential but difficult because of the lack of natural delimiters. Therefore, Chinese Word Segmentation (CWS) is usually considered the first step for Chinese NER. However, models based on word-level embeddings and lexicon features often suffer from segmentation errors and out-of-vocabulary (OOV) words. In this paper, we investigate a Convolutional Attention Network called CAN for Chinese NER, which consists of a character-based convolutional neural network (CNN) with a local-attention layer and a gated recurrent unit (GRU) with a global self-attention layer to capture information from adjacent characters and sentence contexts. Moreover, unlike other models, CAN does not depend on any external resources such as lexicons, and it employs small character embeddings, which makes it more practical. Extensive experimental results show that our approach, without using word embeddings or external lexicon resources, outperforms state-of-the-art methods on datasets from different domains, including the Weibo, MSRA, and Chinese Resume NER datasets.

* This paper was accepted by NAACL-HLT 2019
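
For intuition, the PyTorch sketch below mirrors the described pipeline at the character level: a convolution over adjacent characters (a plain stand-in for the CNN with local attention), a bidirectional GRU, and a global self-attention layer before per-character tag logits. The hyperparameters and the substitution of a plain convolution for the paper's local-attention layer are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ToyCANNER(nn.Module):
    """Character-level sketch of the CAN idea: local context via a
    convolution over adjacent characters, sentence context via a
    bidirectional GRU plus global self-attention, then tag logits.
    Hyperparameters here are illustrative, not the paper's."""
    def __init__(self, vocab_size, n_tags, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.gru = nn.GRU(dim, dim // 2, bidirectional=True,
                          batch_first=True)
        self.attn = nn.MultiheadAttention(dim, num_heads=4,
                                          batch_first=True)
        self.out = nn.Linear(dim, n_tags)

    def forward(self, char_ids):                 # (B, T) character ids
        x = self.emb(char_ids)                   # (B, T, dim)
        # Conv1d expects (B, C, T), so transpose around the convolution.
        x = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        x, _ = self.gru(x)                       # (B, T, dim)
        a, _ = self.attn(x, x, x)                # global self-attention
        return self.out(x + a)                   # (B, T, n_tags)

model = ToyCANNER(vocab_size=5000, n_tags=9)
logits = model(torch.randint(0, 5000, (2, 20)))  # (2, 20, 9)
```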
