Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jian Sun

the State Key Lab of Intelligent Control and Decision of Complex Systems and the School of Automation, Beijing Institute of Technology, Beijing, China, Beijing Institute of Technology Chongqing Innovation Center, Chongqing, China

Enhancing Social Decision-Making of Autonomous Vehicles: A Mixed-Strategy Game Approach With Interaction Orientation Identification

Dec 19, 2023

Jiaqi Liu, Xiao Qi, Peng Hang, Jian Sun

Figure 1 for Enhancing Social Decision-Making of Autonomous Vehicles: A Mixed-Strategy Game Approach With Interaction Orientation Identification

Figure 2 for Enhancing Social Decision-Making of Autonomous Vehicles: A Mixed-Strategy Game Approach With Interaction Orientation Identification

Figure 3 for Enhancing Social Decision-Making of Autonomous Vehicles: A Mixed-Strategy Game Approach With Interaction Orientation Identification

Figure 4 for Enhancing Social Decision-Making of Autonomous Vehicles: A Mixed-Strategy Game Approach With Interaction Orientation Identification

Abstract:The integration of Autonomous Vehicles (AVs) into existing human-driven traffic systems poses considerable challenges, especially within environments where human and machine interactions are frequent and complex, such as at unsignalized intersections. Addressing these challenges, we introduce a novel framework predicated on dynamic and socially-aware decision-making game theory to augment the social decision-making prowess of AVs in mixed driving environments.This comprehensive framework is delineated into three primary modules: Social Tendency Recognition, Mixed-Strategy Game Modeling, and Expert Mode Learning. We introduce 'Interaction Orientation' as a metric to evaluate the social decision-making tendencies of various agents, incorporating both environmental factors and trajectory data. The mixed-strategy game model developed as part of this framework considers the evolution of future traffic scenarios and includes a utility function that balances safety, operational efficiency, and the unpredictability of environmental conditions. To adapt to real-world driving complexities, our framework utilizes dynamic optimization techniques for assimilating and learning from expert human driving strategies. These strategies are compiled into a comprehensive library, serving as a reference for future decision-making processes. Our approach is validated through extensive driving datasets, and the results demonstrate marked enhancements in decision timing, precision.

Via

Access Paper or Ask Questions

ImputeFormer: Graph Transformers for Generalizable Spatiotemporal Imputation

Dec 04, 2023

Tong Nie, Guoyang Qin, Yuewen Mei, Jian Sun

Figure 1 for ImputeFormer: Graph Transformers for Generalizable Spatiotemporal Imputation

Figure 2 for ImputeFormer: Graph Transformers for Generalizable Spatiotemporal Imputation

Figure 3 for ImputeFormer: Graph Transformers for Generalizable Spatiotemporal Imputation

Figure 4 for ImputeFormer: Graph Transformers for Generalizable Spatiotemporal Imputation

Abstract:This paper focuses on the multivariate time series imputation problem using deep neural architectures. The ubiquitous issue of missing data in both scientific and engineering tasks necessitates the development of an effective and general imputation model. Leveraging the wisdom and expertise garnered from low-rank imputation methods, we power the canonical Transformers with three key knowledge-driven enhancements, including projected temporal attention, global adaptive graph convolution, and Fourier imputation loss. These task-agnostic inductive biases exploit the inherent structures of incomplete time series, and thus make our model versatile for a variety of imputation problems. We demonstrate its superiority in terms of accuracy, efficiency, and flexibility on heterogeneous datasets, including traffic speed, traffic volume, solar energy, smart metering, and air quality. Comprehensive case studies are performed to further strengthen the interpretability. Promising empirical results provide strong conviction that incorporating time series primitives, such as low-rank properties, can substantially facilitate the development of a generalizable model to approach a wide range of spatiotemporal imputation problems.

Via

Access Paper or Ask Questions

A WINNER+ Based 3-D Non-Stationary Wideband MIMO Channel Model

Dec 01, 2023

Ji Bian, Jian Sun, Cheng-Xiang Wang, Rui Feng, Jie Huang, Yang Yang, Minggao Zhang

Abstract:In this paper, a three-dimensional (3-D) non-stationary wideband multiple-input multiple-output (MIMO) channel model based on the WINNER+ channel model is proposed. The angular distributions of clusters in both the horizontal and vertical planes are jointly considered. The receiver and clusters can be moving, which makes the model more general. Parameters including number of clusters, powers, delays, azimuth angles of departure (AAoDs), azimuth angles of arrival (AAoAs), elevation angles of departure (EAoDs), and elevation angles of arrival (EAoAs) are time-variant. The cluster time evolution is modeled using a birth-death process. Statistical properties, including spatial cross-correlation function (CCF), temporal autocorrelation function (ACF), Doppler power spectrum density (PSD), level-crossing rate (LCR), average fading duration (AFD), and stationary interval are investigated and analyzed. The LCR, AFD, and stationary interval of the proposed channel model are validated against the measurement data. Numerical and simulation results show that the proposed channel model has the ability to reproduce the main properties of real non-stationary channels. Furthermore, the proposed channel model can be adapted to various communication scenarios by adjusting different parameter values.

Via

Access Paper or Ask Questions

Optimal Transport-Guided Conditional Score-Based Diffusion Models

Nov 02, 2023

Xiang Gu, Liwei Yang, Jian Sun, Zongben Xu

Figure 1 for Optimal Transport-Guided Conditional Score-Based Diffusion Models

Figure 2 for Optimal Transport-Guided Conditional Score-Based Diffusion Models

Figure 3 for Optimal Transport-Guided Conditional Score-Based Diffusion Models

Figure 4 for Optimal Transport-Guided Conditional Score-Based Diffusion Models

Abstract:Conditional score-based diffusion model (SBDM) is for conditional generation of target data with paired data as condition, and has achieved great success in image translation. However, it requires the paired data as condition, and there would be insufficient paired data provided in real-world applications. To tackle the applications with partially paired or even unpaired dataset, we propose a novel Optimal Transport-guided Conditional Score-based diffusion model (OTCS) in this paper. We build the coupling relationship for the unpaired or partially paired dataset based on $L_2$-regularized unsupervised or semi-supervised optimal transport, respectively. Based on the coupling relationship, we develop the objective for training the conditional score-based model for unpaired or partially paired settings, which is based on a reformulation and generalization of the conditional SBDM for paired setting. With the estimated coupling relationship, we effectively train the conditional score-based model by designing a ``resampling-by-compatibility'' strategy to choose the sampled data with high compatibility as guidance. Extensive experiments on unpaired super-resolution and semi-paired image-to-image translation demonstrated the effectiveness of the proposed OTCS model. From the viewpoint of optimal transport, OTCS provides an approach to transport data across distributions, which is a challenge for OT on large-scale datasets. We theoretically prove that OTCS realizes the data transport in OT with a theoretical bound. Code is available at \url{https://github.com/XJTU-XGU/OTCS}.

* Accepted in NeurIPS 2023

Via

Access Paper or Ask Questions

STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning

Oct 14, 2023

Weipu Zhang, Gang Wang, Jian Sun, Yetian Yuan, Gao Huang

Abstract:Recently, model-based reinforcement learning algorithms have demonstrated remarkable efficacy in visual input environments. These approaches begin by constructing a parameterized simulation world model of the real environment through self-supervised learning. By leveraging the imagination of the world model, the agent's policy is enhanced without the constraints of sampling from the real environment. The performance of these algorithms heavily relies on the sequence modeling and generation capabilities of the world model. However, constructing a perfectly accurate model of a complex unknown environment is nearly impossible. Discrepancies between the model and reality may cause the agent to pursue virtual goals, resulting in subpar performance in the real environment. Introducing random noise into model-based reinforcement learning has been proven beneficial. In this work, we introduce Stochastic Transformer-based wORld Model (STORM), an efficient world model architecture that combines the strong sequence modeling and generation capabilities of Transformers with the stochastic nature of variational autoencoders. STORM achieves a mean human performance of $126.7\%$ on the Atari $100$k benchmark, setting a new record among state-of-the-art methods that do not employ lookahead search techniques. Moreover, training an agent with $1.85$ hours of real-time interaction experience on a single NVIDIA GeForce RTX 3090 graphics card requires only $4.3$ hours, showcasing improved efficiency compared to previous methodologies.

Via

Access Paper or Ask Questions

Identifying factors associated with fast visual field progression in patients with ocular hypertension based on unsupervised machine learning

Sep 26, 2023

Xiaoqin Huang, Asma Poursoroush, Jian Sun, Michael V. Boland, Chris Johnson, Siamak Yousefi

Abstract:Purpose: To identify ocular hypertension (OHT) subtypes with different trends of visual field (VF) progression based on unsupervised machine learning and to discover factors associated with fast VF progression. Participants: A total of 3133 eyes of 1568 ocular hypertension treatment study (OHTS) participants with at least five follow-up VF tests were included in the study. Methods: We used a latent class mixed model (LCMM) to identify OHT subtypes using standard automated perimetry (SAP) mean deviation (MD) trajectories. We characterized the subtypes based on demographic, clinical, ocular, and VF factors at the baseline. We then identified factors driving fast VF progression using generalized estimating equation (GEE) and justified findings qualitatively and quantitatively. Results: The LCMM model discovered four clusters (subtypes) of eyes with different trajectories of MD worsening. The number of eyes in clusters were 794 (25%), 1675 (54%), 531 (17%) and 133 (4%). We labelled the clusters as Improvers, Stables, Slow progressors, and Fast progressors based on their mean of MD decline, which were 0.08, -0.06, -0.21, and -0.45 dB/year, respectively. Eyes with fast VF progression had higher baseline age, intraocular pressure (IOP), pattern standard deviation (PSD) and refractive error (RE), but lower central corneal thickness (CCT). Fast progression was associated with calcium channel blockers, being male, heart disease history, diabetes history, African American race, stroke history, and migraine headaches.

Via

Access Paper or Ask Questions

Towards Socially Responsive Autonomous Vehicles: A Reinforcement Learning Framework with Driving Priors and Coordination Awareness

Sep 18, 2023

Jiaqi Liu, Donghao Zhou, Peng Hang, Ying Ni, Jian Sun

Figure 1 for Towards Socially Responsive Autonomous Vehicles: A Reinforcement Learning Framework with Driving Priors and Coordination Awareness

Figure 2 for Towards Socially Responsive Autonomous Vehicles: A Reinforcement Learning Framework with Driving Priors and Coordination Awareness

Figure 3 for Towards Socially Responsive Autonomous Vehicles: A Reinforcement Learning Framework with Driving Priors and Coordination Awareness

Figure 4 for Towards Socially Responsive Autonomous Vehicles: A Reinforcement Learning Framework with Driving Priors and Coordination Awareness

Abstract:The advent of autonomous vehicles (AVs) alongside human-driven vehicles (HVs) has ushered in an era of mixed traffic flow, presenting a significant challenge: the intricate interaction between these entities within complex driving environments. AVs are expected to have human-like driving behavior to seamlessly integrate into human-dominated traffic systems. To address this issue, we propose a reinforcement learning framework that considers driving priors and Social Coordination Awareness (SCA) to optimize the behavior of AVs. The framework integrates a driving prior learning (DPL) model based on a variational autoencoder to infer the driver's driving priors from human drivers' trajectories. A policy network based on a multi-head attention mechanism is designed to effectively capture the interactive dependencies between AVs and other traffic participants to improve decision-making quality. The introduction of SCA into the autonomous driving decision-making system, and the use of Coordination Tendency (CT) to quantify the willingness of AVs to coordinate the traffic system is explored. Simulation results show that the proposed framework can not only improve the decision-making quality of AVs but also motivate them to produce social behaviors, with potential benefits for the safety and traffic efficiency of the entire transportation system.

Via

Access Paper or Ask Questions

Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Aug 31, 2023

Cuican Yu, Guansong Lu, Yihan Zeng, Jian Sun, Xiaodan Liang, Huibin Li, Zongben Xu, Songcen Xu, Wei Zhang, Hang Xu

Figure 1 for Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Figure 2 for Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Figure 3 for Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Figure 4 for Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Abstract:Generating 3D faces from textual descriptions has a multitude of applications, such as gaming, movie, and robotics. Recent progresses have demonstrated the success of unconditional 3D face generation and text-to-3D shape generation. However, due to the limited text-3D face data pairs, text-driven 3D face generation remains an open problem. In this paper, we propose a text-guided 3D faces generation method, refer as TG-3DFace, for generating realistic 3D faces using text guidance. Specifically, we adopt an unconditional 3D face generation framework and equip it with text conditions, which learns the text-guided 3D face generation with only text-2D face data. On top of that, we propose two text-to-face cross-modal alignment techniques, including the global contrastive learning and the fine-grained alignment module, to facilitate high semantic consistency between generated 3D faces and input texts. Besides, we present directional classifier guidance during the inference process, which encourages creativity for out-of-domain generations. Compared to the existing methods, TG-3DFace creates more realistic and aesthetically pleasing 3D faces, boosting 9% multi-view consistency (MVIC) over Latent3D. The rendered face images generated by TG-3DFace achieve higher FID and CLIP score than text-to-2D face/image generation models, demonstrating our superiority in generating realistic and semantic-consistent textures.

* accepted by ICCV 2023

Via

Access Paper or Ask Questions

Soft Decomposed Policy-Critic: Bridging the Gap for Effective Continuous Control with Discrete RL

Aug 20, 2023

Yechen Zhang, Jian Sun, Gang Wang, Zhuo Li, Wei Chen

Abstract:Discrete reinforcement learning (RL) algorithms have demonstrated exceptional performance in solving sequential decision tasks with discrete action spaces, such as Atari games. However, their effectiveness is hindered when applied to continuous control problems due to the challenge of dimensional explosion. In this paper, we present the Soft Decomposed Policy-Critic (SDPC) architecture, which combines soft RL and actor-critic techniques with discrete RL methods to overcome this limitation. SDPC discretizes each action dimension independently and employs a shared critic network to maximize the soft $Q$-function. This novel approach enables SDPC to support two types of policies: decomposed actors that lead to the Soft Decomposed Actor-Critic (SDAC) algorithm, and decomposed $Q$-networks that generate Boltzmann soft exploration policies, resulting in the Soft Decomposed-Critic Q (SDCQ) algorithm. Through extensive experiments, we demonstrate that our proposed approach outperforms state-of-the-art continuous RL algorithms in a variety of continuous control tasks, including Mujoco's Humanoid and Box2d's BipedalWalker. These empirical results validate the effectiveness of the SDPC architecture in addressing the challenges associated with continuous control.

Via

Access Paper or Ask Questions

MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections

Jul 30, 2023

Jiaqi Liu, Peng Hang, Xiao qi, Jianqiang Wang, Jian Sun

Figure 1 for MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections

Figure 2 for MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections

Figure 3 for MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections

Figure 4 for MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections

Abstract:Autonomous driving technology is poised to transform transportation systems. However, achieving safe and accurate multi-task decision-making in complex scenarios, such as unsignalized intersections, remains a challenge for autonomous vehicles. This paper presents a novel approach to this issue with the development of a Multi-Task Decision-Making Generative Pre-trained Transformer (MTD-GPT) model. Leveraging the inherent strengths of reinforcement learning (RL) and the sophisticated sequence modeling capabilities of the Generative Pre-trained Transformer (GPT), the MTD-GPT model is designed to simultaneously manage multiple driving tasks, such as left turns, straight-ahead driving, and right turns at unsignalized intersections. We initially train a single-task RL expert model, sample expert data in the environment, and subsequently utilize a mixed multi-task dataset for offline GPT training. This approach abstracts the multi-task decision-making problem in autonomous driving as a sequence modeling task. The MTD-GPT model is trained and evaluated across several decision-making tasks, demonstrating performance that is either superior or comparable to that of state-of-the-art single-task decision-making models.

* Accepted by ITSC 2023

Via

Access Paper or Ask Questions