Discrete-choice models are a powerful framework for analyzing decision-making behavior to provide valuable insights for policymakers and businesses. Multinomial logit models (MNLs) with linear utility functions have been used in practice because they are ease to use and interpretable. Recently, MNLs with neural networks (e.g., ASU-DNN) have been developed, and they have achieved higher prediction accuracy in behavior choice than classical MNLs. However, these models lack interpretability owing to complex structures. We developed utility functions with a novel neural-network architecture based on generalized additive models, named generalized additive utility network ( GAUNet), for discrete-choice models. We evaluated the performance of the MNL with GAUNet using the trip survey data collected in Tokyo. Our models were comparable to ASU-DNN in accuracy and exhibited improved interpretability compared to previous models.
Freeway merging in congested traffic is a significant challenge toward fully automated driving. Merging vehicles need to decide not only how to merge into a spot, but also where to merge. We present a method for the freeway merging based on multi-policy decision making with a reinforcement learning method called {\em passive actor-critic} (pAC), which learns with less knowledge of the system and without active exploration. The method selects a merging spot candidate by using the state value learned with pAC. We evaluate our method using real traffic data. Our experiments show that pAC achieves 92\% success rate to merge into a freeway, which is comparable to human decision making.
In many robotic applications, some aspects of the system dynamics can be modeled accurately while others are difficult to obtain or model. We present a novel reinforcement learning (RL) method for continuous state and action spaces that learns with partial knowledge of the system and without active exploration. It solves linearly-solvable Markov decision processes (L-MDPs), which are well suited for continuous state and action spaces, based on an actor-critic architecture. Compared to previous RL methods for L-MDPs and path integral methods which are model based, the actor-critic learning does not need a model of the uncontrolled dynamics and, importantly, transition noise levels; however, it requires knowing the control dynamics for the problem. We evaluate our method on two synthetic test problems, and one real-world problem in simulation and using real traffic data. Our experiments demonstrate improved learning and policy performance.