Jianqiang Wang

MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections

Jul 30, 2023
Jiaqi Liu, Peng Hang, Xiao Qi, Jianqiang Wang, Jian Sun

Autonomous driving technology is poised to transform transportation systems. However, achieving safe and accurate multi-task decision-making in complex scenarios, such as unsignalized intersections, remains a challenge for autonomous vehicles. This paper presents a novel approach to this issue with the development of a Multi-Task Decision-Making Generative Pre-trained Transformer (MTD-GPT) model. Leveraging the inherent strengths of reinforcement learning (RL) and the sophisticated sequence modeling capabilities of the Generative Pre-trained Transformer (GPT), the MTD-GPT model is designed to simultaneously manage multiple driving tasks, such as left turns, straight-ahead driving, and right turns at unsignalized intersections. We initially train a single-task RL expert model, sample expert data in the environment, and subsequently utilize a mixed multi-task dataset for offline GPT training. This approach abstracts the multi-task decision-making problem in autonomous driving as a sequence modeling task. The MTD-GPT model is trained and evaluated across several decision-making tasks, demonstrating performance that is either superior or comparable to that of state-of-the-art single-task decision-making models.
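
The abstract frames multi-task driving decisions as offline sequence modeling. Below is a minimal, hypothetical PyTorch sketch of that formulation, in the spirit of decision-transformer-style models: trajectories of (return-to-go, state, action) tokens are conditioned on a task embedding (left turn, straight, right turn) and passed through a causal transformer that predicts the next action. All module names, dimensions, and hyperparameters are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class MultiTaskDecisionGPT(nn.Module):
    """Task-conditioned GPT over (return-to-go, state, action) token sequences."""
    def __init__(self, state_dim, act_dim, n_tasks, d_model=128,
                 n_layer=3, n_head=4, max_len=64):
        super().__init__()
        self.embed_return = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        self.embed_task = nn.Embedding(n_tasks, d_model)  # left / straight / right
        self.embed_pos = nn.Embedding(3 * max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_head, batch_first=True)
        self.gpt = nn.TransformerEncoder(layer, n_layer)
        self.action_head = nn.Linear(d_model, act_dim)

    def forward(self, task_id, returns, states, actions):
        # returns: (B, T, 1), states: (B, T, state_dim), actions: (B, T, act_dim)
        B, T, _ = states.shape
        r = self.embed_return(returns)
        s = self.embed_state(states)
        a = self.embed_action(actions)
        # interleave tokens as (r_1, s_1, a_1, r_2, s_2, a_2, ...)
        tokens = torch.stack([r, s, a], dim=2).reshape(B, 3 * T, -1)
        tokens = tokens + self.embed_task(task_id).unsqueeze(1)  # task conditioning
        tokens = tokens + self.embed_pos(torch.arange(3 * T, device=tokens.device))
        causal = nn.Transformer.generate_square_subsequent_mask(3 * T)
        h = self.gpt(tokens, mask=causal.to(tokens.device))
        return self.action_head(h[:, 1::3])  # predict the action after each state token
```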

* Accepted by ITSC 2023 

A Survey on Datasets for Decision-making of Autonomous Vehicle

Jun 29, 2023
Yuning Wang, Zeyu Han, Yining Xing, Shaobing Xu, Jianqiang Wang

Autonomous vehicles (AV) are expected to reshape future transportation systems, and decision-making is one of the critical modules toward high-level automated driving. To handle complicated scenarios that rule-based methods cannot cope with well, data-driven decision-making approaches have attracted increasing attention. The datasets used to develop data-driven methods strongly influence decision-making performance, so it is necessary to have a comprehensive insight into the existing datasets. By collection source, driving data can be divided into vehicle-related, environment-related, and driver-related data. This study compares state-of-the-art datasets in these three categories and summarizes their features, including the sensors used, annotations, and driving scenarios. Based on the characteristics of the datasets, this survey also outlines the potential applications of datasets to various aspects of AV decision-making, helping researchers find appropriate datasets to support their own research. Finally, future trends in AV dataset development are summarized.

4D Millimeter-Wave Radar in Autonomous Driving: A Survey

Jun 14, 2023
Zeyu Han, Jiahao Wang, Zikun Xu, Shuocheng Yang, Lei He, Shaobing Xu, Jianqiang Wang

The 4D millimeter-wave (mmWave) radar, capable of measuring the range, azimuth, elevation, and velocity of targets, has attracted considerable interest in the autonomous driving community. This is attributed to its robustness in extreme environments and its outstanding velocity and elevation measurement capabilities. However, despite the rapid development of research on its sensing theory and applications, there is a notable lack of surveys on 4D mmWave radar. To address this gap and foster future research in this area, this paper presents a comprehensive survey on the use of 4D mmWave radar in autonomous driving. The theoretical background and progress of 4D mmWave radar are reviewed first, including the signal processing flow, resolution improvement approaches, extrinsic calibration process, and point cloud generation methods. Related datasets and application algorithms for autonomous driving perception, localization, and mapping are then introduced. Finally, the paper concludes by predicting future trends in the field of 4D mmWave radar. To the best of our knowledge, this is the first survey dedicated to the 4D mmWave radar.

* 8 pages, 5 figures 

Lossless Point Cloud Attribute Compression Using Cross-scale, Cross-group, and Cross-color Prediction

Mar 22, 2023
Jianqiang Wang, Dandan Ding, Zhan Ma

This work extends the multiscale structure originally developed for point cloud geometry compression to point cloud attribute compression. To losslessly encode attributes while maintaining a low bitrate, accurate probability prediction is critical. To this end, we extensively exploit cross-scale, cross-group, and cross-color correlations of point cloud attributes to ensure accurate probability estimation and thus high coding efficiency. Specifically, we first generate multiscale attribute tensors through average pooling, so that for any two consecutive scales, the decoded lower-scale attributes can be used to estimate the attribute probabilities of the current scale in one shot. Additionally, within each scale, we perform probability estimation group-wise, following a predefined grouping pattern. In this way, both cross-scale and (same-scale) cross-group correlations are exploited jointly. Furthermore, cross-color redundancy is removed by allowing inter-color processing for multi-channel attributes such as YCoCg and RGB. The proposed method not only demonstrates state-of-the-art compression efficiency, with significant performance gains over the latest G-PCC on various contents, but also sustains low complexity with affordable encoding and decoding runtimes.
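
As a rough illustration of the cross-scale idea, the sketch below builds multiscale attribute tensors by average pooling and uses the upsampled lower scale to condition a probability model for the current scale; the group-wise and cross-color prediction described above are omitted for brevity. The network layout and all names are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def build_scales(attr_voxels, n_scales=3):
    """attr_voxels: (B, C, D, H, W) voxelized attributes; returns a coarse-to-fine list."""
    scales = [attr_voxels]
    for _ in range(n_scales - 1):
        scales.append(F.avg_pool3d(scales[-1], kernel_size=2))  # average pooling per scale
    return scales[::-1]  # coarsest first, matching the coarse-to-fine decoding order

class CrossScalePredictor(nn.Module):
    """Estimates symbol probabilities at scale s conditioned on decoded scale s-1."""
    def __init__(self, channels=3, n_symbols=256):
        super().__init__()
        self.n_symbols = n_symbols
        self.net = nn.Sequential(
            nn.Conv3d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv3d(32, channels * n_symbols, 3, padding=1),
        )

    def forward(self, decoded_lower):
        up = F.interpolate(decoded_lower, scale_factor=2, mode="nearest")
        logits = self.net(up)
        B, _, D, H, W = logits.shape
        # one probability distribution per channel and voxel, predicted in one shot
        return logits.view(B, -1, self.n_symbols, D, H, W).softmax(dim=2)
```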

* 10 pages 

Dynamic Point Cloud Geometry Compression Using Multiscale Inter Conditional Coding

Jan 28, 2023
Jianqiang Wang, Dandan Ding, Hao Chen, Zhan Ma

This work extends the Multiscale Sparse Representation (MSR) framework developed for static Point Cloud Geometry Compression (PCGC) to dynamic PCGC through the use of multiscale inter conditional coding. To this end, the reconstruction of the preceding Point Cloud Geometry (PCG) frame is progressively downscaled to generate multiscale temporal priors, which are then transferred scale-wise and integrated with lower-scale spatial priors from the same frame to form the contextual information that improves occupancy probability approximation when processing the current PCG frame from one scale to another. Following the Common Test Conditions (CTC) defined by the standardization committee, the proposed method delivers state-of-the-art (SOTA) compression performance, yielding a 78% lossy BD-Rate gain over the latest standard-compliant V-PCC and a 45% lossless bitrate reduction over the latest G-PCC. Even against recently emerged learning-based solutions, our method still shows significant performance gains.
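
A hedged sketch of the inter conditional coding step described above: the previous frame's reconstruction is progressively downscaled into temporal priors, and each is fused with the same-frame lower-scale spatial prior to form the context for occupancy probability estimation. Dense tensors stand in for the sparse tensors used in practice; all names and shapes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def temporal_priors(prev_recon_feats, n_scales=3):
    """Progressively downscale the previous frame's reconstruction features."""
    priors = [prev_recon_feats]  # (B, ch, D, H, W)
    for _ in range(n_scales - 1):
        priors.append(F.max_pool3d(priors[-1], kernel_size=2))
    return priors[::-1]  # coarsest first, matching scale-wise decoding

class InterConditionalContext(nn.Module):
    """Fuses a temporal prior with the same-frame spatial prior at one scale."""
    def __init__(self, ch=16):
        super().__init__()
        self.fuse = nn.Conv3d(2 * ch, ch, kernel_size=3, padding=1)
        self.occupancy_head = nn.Conv3d(ch, 1, kernel_size=3, padding=1)

    def forward(self, temporal_prior, spatial_prior):
        ctx = torch.relu(self.fuse(torch.cat([temporal_prior, spatial_prior], dim=1)))
        return torch.sigmoid(self.occupancy_head(ctx))  # per-voxel occupancy probability
```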

* 5 pages 

Mixed Cloud Control Testbed: Validating Vehicle-Road-Cloud Integration via Mixed Digital Twin

Dec 05, 2022
Jianghong Dong, Qing Xu, Jiawei Wang, Chunying Yang, Mengchi Cai, Chaoyi Chen, Jianqiang Wang, Keqiang Li

Reliable and efficient validation technologies are critical for the recent development of multi-vehicle cooperation and vehicle-road-cloud integration. In this paper, we introduce our miniature experimental platform, the Mixed Cloud Control Testbed (MCCT), developed based on a new notion of Mixed Digital Twin (mixedDT). Combining Mixed Reality with Digital Twin, mixedDT integrates the virtual and physical spaces into a mixed one, where physical entities coexist and interact with virtual entities via their digital counterparts. Under the mixedDT framework, MCCT contains three major experimental platforms, in the physical, virtual, and mixed spaces respectively, and provides unified access for various human-machine interfaces and external devices such as driving simulators. A cloud unit, on which the mixed experimental platform is deployed, is responsible for fusing multi-platform information and assigning control instructions, enabling synchronous operation and real-time cross-platform interaction. In particular, MCCT supports multi-vehicle coordination among vehicles of different sources (e.g., physical vehicles, virtual vehicles, and human-driven vehicles). Validations on vehicle platooning demonstrate the flexibility and scalability of MCCT.
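
To make the data flow concrete, here is a toy sketch of the cloud unit's role as described above: fusing vehicle states reported from the physical, virtual, and mixed platforms into one mixed space, then assigning control instructions (here a simple platoon-spacing rule, since platooning is the validation case). The message fields and control law are hypothetical, not MCCT's actual interfaces.

```python
from dataclasses import dataclass

@dataclass
class VehicleState:
    vid: str
    platform: str  # "physical", "virtual", or "human-driven"
    x: float       # longitudinal position along the track
    v: float       # speed

class CloudUnit:
    """Toy cloud unit: fuses multi-platform states and assigns control instructions."""
    def __init__(self):
        self.world = {}  # the mixed space, keyed by vehicle id

    def fuse(self, states):
        # digital counterparts of physical vehicles coexist with virtual ones
        for s in states:
            self.world[s.vid] = s

    def assign_controls(self, target_gap=10.0, kp=0.5):
        # toy platooning rule: each follower regulates its spacing to the vehicle ahead
        ordered = sorted(self.world.values(), key=lambda s: -s.x)
        return {
            f.vid: f.v + kp * ((l.x - f.x) - target_gap)  # speed command per follower
            for l, f in zip(ordered, ordered[1:])
        }
```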

* 13 pages, 13 figures 

Synthesize Efficient Safety Certificates for Learning-Based Safe Control using Magnitude Regularization

Sep 23, 2022
Haotian Zheng, Haitong Ma, Sifa Zheng, Shengbo Eben Li, Jianqiang Wang

Energy-function-based safety certificates can provide provable safety guarantees for the safe control of complex robotic systems. However, recent studies of learning-based energy-function synthesis consider only feasibility, which can cause over-conservativeness and result in less efficient controllers. In this work, we propose a magnitude regularization technique to improve the efficiency of safe controllers by reducing the conservativeness inside the energy function while keeping the provable safety guarantees. Specifically, we quantify conservativeness by the magnitude of the energy function, and we reduce it by adding a magnitude regularization term to the synthesis loss. We propose the SafeMR algorithm, which uses reinforcement learning (RL) for the synthesis to unify the learning processes of safe controllers and energy functions. Experimental results show that the proposed method does reduce the conservativeness of the energy functions and outperforms the baselines in controller efficiency while guaranteeing safety.
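
The key change the abstract describes is adding a magnitude term to the energy-function synthesis loss. The sketch below shows one plausible form, assuming a neural energy function phi and a finite-difference surrogate for its time derivative; the feasibility term, margin eta, and weight lambda_mag are illustrative, not the paper's exact loss.

```python
import torch

def synthesis_loss(phi, states, states_dot, safe_mask, lambda_mag=0.1, eta=0.01):
    """phi: a neural energy function mapping states (B, n) to scalar energies (B, 1)."""
    energy = phi(states).squeeze(-1)
    # finite-difference surrogate for the energy's time derivative along the dynamics
    phi_dot = (phi(states + 1e-2 * states_dot).squeeze(-1) - energy) / 1e-2
    # feasibility: where the energy is non-negative, it must decrease by at least eta
    feasibility = (torch.relu(phi_dot + eta) * (energy >= 0).float()).mean()
    # magnitude regularization: shrink |phi| on safe states to cut conservativeness
    # while leaving the zero-sublevel set, and hence the safety guarantee, intact
    safe = safe_mask.float()
    magnitude = (energy.abs() * safe).sum() / safe.sum().clamp(min=1.0)
    return feasibility + lambda_mag * magnitude
```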

* 8 pages, 6 figures 

CARNet: Compression Artifact Reduction for Point Cloud Attribute

Sep 17, 2022
Dandan Ding, Junzhe Zhang, Jianqiang Wang, Zhan Ma

A learning-based adaptive loop filter is developed for the Geometry-based Point Cloud Compression (G-PCC) standard to reduce attribute compression artifacts. The proposed method first generates multiple Most-Probable Sample Offsets (MPSOs) as potential compression distortion approximations, and then linearly weights them for artifact mitigation. As such, we drive the filtered reconstruction as close to the uncompressed point cloud attribute (PCA) as possible. To this end, we devise a Compression Artifact Reduction Network (CARNet) consisting of two consecutive processing phases: MPSO derivation and MPSO combination. The MPSO derivation uses a two-stream network to model local neighborhood variations from a direct spatial embedding and a frequency-dependent embedding, where sparse convolutions are utilized to best aggregate information from sparsely and irregularly distributed points. The MPSO combination is guided by the least-squares error metric to derive weighting coefficients on the fly, further capturing the content dynamics of input PCAs. The CARNet is implemented as an in-loop filtering tool of G-PCC, where the linear weighting coefficients are encapsulated into the bitstream with negligible bitrate overhead. Experimental results demonstrate significant improvement over the latest G-PCC, both subjectively and objectively.
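
The least-squares combination step can be sketched as follows: given K candidate offsets (MPSOs) from the two-stream network, the encoder solves for the linear weights that bring the filtered reconstruction closest to the uncompressed attributes, and signals them in the bitstream. Shapes and names here are assumptions for illustration.

```python
import torch

def combine_mpsos(recon, mpsos, original):
    """
    recon:    (N, C) decoded attributes
    mpsos:    (K, N, C) candidate offsets from the two-stream network
    original: (N, C) uncompressed attributes (available at the encoder only)
    """
    K = mpsos.shape[0]
    A = mpsos.reshape(K, -1).T             # (N*C, K) design matrix of offsets
    b = (original - recon).reshape(-1, 1)  # residual the offsets should explain
    w = torch.linalg.lstsq(A, b).solution  # (K, 1) least-squares weights, on the fly
    filtered = recon + (w.view(K, 1, 1) * mpsos).sum(dim=0)
    return filtered, w.flatten()           # weights would be signaled in the bitstream
```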

* 13 pages, 8 figures 

Efficient LiDAR Point Cloud Geometry Compression Through Neighborhood Point Attention

Aug 26, 2022
Ruixiang Xue, Jianqiang Wang, Zhan Ma

Although the convolutional representation of multiscale sparse tensors has demonstrated superior efficiency in accurately modeling occupancy probability for compressing the geometry of dense object point clouds, its capacity for representing sparse LiDAR point cloud geometry (PCG) is largely limited. This is because 1) the fixed receptive field of convolutions cannot characterize extremely sparse and unevenly distributed LiDAR points well; and 2) pretrained convolutions with fixed weights are insufficient to dynamically capture information conditioned on the input. This work therefore proposes neighborhood point attention (NPA) to tackle these issues: we first use k nearest neighbors (kNN) to construct an adaptive local neighborhood, and then leverage the self-attention mechanism to dynamically aggregate information within this neighborhood. The NPA is devised as an NPAFormer to best exploit cross-scale and same-scale correlations for geometric occupancy probability estimation. Compared with the anchor using standardized G-PCC, our method provides >17% BD-rate gains for lossy compression and >14% bitrate reduction in the lossless scenario on popular LiDAR point clouds from the SemanticKITTI and Ford datasets. Compared with the state-of-the-art (SOTA) solution using an attention-optimized octree coding method, our approach requires far less decoding runtime, with about a 640x speedup on average, while still delivering better compression efficiency.
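
A minimal sketch of neighborhood point attention as the abstract describes it: kNN builds an adaptive local neighborhood for each point, and self-attention dynamically aggregates features within that neighborhood. The brute-force O(N^2) distance computation and the layer shapes are simplifications for illustration, not the NPAFormer's actual design.

```python
import torch
import torch.nn as nn

class NeighborhoodPointAttention(nn.Module):
    def __init__(self, dim=32, k=16):
        super().__init__()
        self.k = k
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)

    def forward(self, xyz, feat):
        # xyz: (N, 3) point coordinates, feat: (N, dim) point features
        dist = torch.cdist(xyz, xyz)                    # (N, N) pairwise distances
        idx = dist.topk(self.k, largest=False).indices  # (N, k) adaptive neighborhood
        q = self.q(feat).unsqueeze(1)                   # (N, 1, dim) query per point
        key, val = self.kv(feat[idx]).chunk(2, dim=-1)  # (N, k, dim) neighbor keys/values
        attn = (q @ key.transpose(1, 2) / key.shape[-1] ** 0.5).softmax(-1)  # (N, 1, k)
        return (attn @ val).squeeze(1)                  # (N, dim) aggregated features
```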
