Platooning and coordination are two implementation strategies that are frequently proposed for traffic control of connected and autonomous vehicles (CAVs) at signal-free intersections instead of using conventional traffic signals. However, few studies have attempted to integrate both strategies to better facilitate the CAV control at signal-free intersections. To this end, this study proposes a hierarchical control model, named COOR-PLT, to coordinate adaptive CAV platoons at a signal-free intersection based on deep reinforcement learning (DRL). COOR-PLT has a two-layer framework. The first layer uses a centralized control strategy to form adaptive platoons. The optimal size of each platoon is determined by considering multiple objectives (i.e., efficiency, fairness and energy saving). The second layer employs a decentralized control strategy to coordinate multiple platoons passing through the intersection. Each platoon is labeled with coordinated status or independent status, upon which its passing priority is determined. As an efficient DRL algorithm, Deep Q-network (DQN) is adopted to determine platoon sizes and passing priorities respectively in the two layers. The model is validated and examined on the simulator Simulation of Urban Mobility (SUMO). The simulation results demonstrate that the model is able to: (1) achieve satisfactory convergence performances; (2) adaptively determine platoon size in response to varying traffic conditions; and (3) completely avoid deadlocks at the intersection. By comparison with other control methods, the model manifests its superiority of adopting adaptive platooning and DRL-based coordination strategies. Also, the model outperforms several state-of-the-art methods on reducing travel time and fuel consumption in different traffic conditions.
Meta learning has demonstrated tremendous success in few-shot learning with limited supervised data. In those settings, the meta model is usually overparameterized. While the conventional statistical learning theory suggests that overparameterized models tend to overfit, empirical evidence reveals that overparameterized meta learning methods still work well -- a phenomenon often called ``benign overfitting.'' To understand this phenomenon, we focus on the meta learning settings with a challenging nested structure that we term the nested meta learning, and analyze its generalization performance under an overparameterized meta learning model. While our analysis uses the relatively tractable linear models, our theory contributes to understanding the delicate interplay among data heterogeneity, model adaptation and benign overfitting in nested meta learning tasks. We corroborate our theoretical claims through numerical simulations.
As a strategy to reduce travel delay and enhance energy efficiency, platooning of connected and autonomous vehicles (CAVs) at non-signalized intersections has become increasingly popular in academia. However, few studies have attempted to model the relation between the optimal platoon size and the traffic conditions around the intersection. To this end, this study proposes an adaptive platoon based autonomous intersection control model powered by deep reinforcement learning (DRL) technique. The model framework has following two levels: the first level adopts a First Come First Serve (FCFS) reservation based policy integrated with a nonconflicting lane selection mechanism to determine vehicles' passing priority; and the second level applies a deep Q-network algorithm to identify the optimal platoon size based on the real-time traffic condition of an intersection. When being tested on a traffic micro-simulator, our proposed model exhibits superior performances on travel efficiency and fuel conservation as compared to the state-of-the-art methods.
Stochastic approximation (SA) with multiple coupled sequences has found broad applications in machine learning such as bilevel learning and reinforcement learning (RL). In this paper, we study the finite-time convergence of nonlinear SA with multiple coupled sequences. Different from existing multi-timescale analysis, we seek for scenarios where a fine-grained analysis can provide the tight performance guarantee for multi-sequence single-timescale SA (STSA). At the heart of our analysis is the smoothness property of the fixed points in multi-sequence SA that holds in many applications. When all sequences have strongly monotone increments, we establish the iteration complexity of $\mathcal{O}(\epsilon^{-1})$ to achieve $\epsilon$-accuracy, which improves the existing $\mathcal{O}(\epsilon^{-1.5})$ complexity for two coupled sequences. When all but the main sequence have strongly monotone increments, we establish the iteration complexity of $\mathcal{O}(\epsilon^{-2})$. The merit of our results lies in that applying them to stochastic bilevel and compositional optimization problems, as well as RL problems leads to either relaxed assumptions or improvements over their existing performance guarantees.
A major challenge of applying zeroth-order (ZO) methods is the high query complexity, especially when queries are costly. We propose a novel gradient estimation technique for ZO methods based on adaptive lazy queries that we term as LAZO. Different from the classic one-point or two-point gradient estimation methods, LAZO develops two alternative ways to check the usefulness of old queries from previous iterations, and then adaptively reuses them to construct the low-variance gradient estimates. We rigorously establish that through judiciously reusing the old queries, LAZO can reduce the variance of stochastic gradient estimates so that it not only saves queries per iteration but also achieves the regret bound for the symmetric two-point method. We evaluate the numerical performance of LAZO, and demonstrate the low-variance property and the performance gain of LAZO in both regret and query complexity relative to several existing ZO methods. The idea of LAZO is general, and can be applied to other variants of ZO methods.
Model-agnostic meta learning (MAML) is currently one of the dominating approaches for few-shot meta-learning. Albeit its effectiveness, the optimization of MAML can be challenging due to the innate bilevel problem structure. Specifically, the loss landscape of MAML is much more complex with possibly more saddle points and local minimizers than its empirical risk minimization counterpart. To address this challenge, we leverage the recently invented sharpness-aware minimization and develop a sharpness-aware MAML approach that we term Sharp-MAML. We empirically demonstrate that Sharp-MAML and its computation-efficient variant can outperform popular existing MAML baselines (e.g., $+12\%$ accuracy on Mini-Imagenet). We complement the empirical study with the convergence rate analysis and the generalization bound of Sharp-MAML. To the best of our knowledge, this is the first empirical and theoretical study on sharpness-aware minimization in the context of bilevel learning. The code is available at https://github.com/mominabbass/Sharp-MAML.
Federated learning (FL) can collaboratively train deep learning models using isolated patient data owned by different hospitals for various clinical applications, including medical image segmentation. However, a major problem of FL is its performance degradation when dealing with the data that are not independently and identically distributed (non-iid), which is often the case in medical images. In this paper, we first conduct a theoretical analysis on the FL algorithm to reveal the problem of model aggregation during training on non-iid data. With the insights gained through the analysis, we propose a simple and yet effective method, federated cross learning (FedCross), to tackle this challenging problem. Unlike the conventional FL methods that combine multiple individually trained local models on a server node, our FedCross sequentially trains the global model across different clients in a round-robin manner, and thus the entire training procedure does not involve any model aggregation steps. To further improve its performance to be comparable with the centralized learning method, we combine the FedCross with an ensemble learning mechanism to compose a federated cross ensemble learning (FedCrossEns) method. Finally, we conduct extensive experiments using a set of public datasets. The experimental results show that the proposed FedCross training strategy outperforms the mainstream FL methods on non-iid data. In addition to improving the segmentation performance, our FedCrossEns can further provide a quantitative estimation of the model uncertainty, demonstrating the effectiveness and clinical significance of our designs. Source code will be made publicly available after paper publication.
Meta learning aims at learning a model that can quickly adapt to unseen tasks. Widely used meta learning methods include model agnostic meta learning (MAML), implicit MAML, Bayesian MAML. Thanks to its ability of modeling uncertainty, Bayesian MAML often has advantageous empirical performance. However, the theoretical understanding of Bayesian MAML is still limited, especially on questions such as if and when Bayesian MAML has provably better performance than MAML. In this paper, we aim to provide theoretical justifications for Bayesian MAML's advantageous performance by comparing the meta test risks of MAML and Bayesian MAML. In the meta linear regression, under both the distribution agnostic and linear centroid cases, we have established that Bayesian MAML indeed has provably lower meta test risks than MAML. We verify our theoretical results through experiments.
Crime prediction is crucial for public safety and resource optimization, yet is very challenging due to two aspects: i) the dynamics of criminal patterns across time and space, crime events are distributed unevenly on both spatial and temporal domains; ii) time-evolving dependencies between different types of crimes (e.g., Theft, Robbery, Assault, Damage) which reveal fine-grained semantics of crimes. To tackle these challenges, we propose Spatial-Temporal Sequential Hypergraph Network (ST-SHN) to collectively encode complex crime spatial-temporal patterns as well as the underlying category-wise crime semantic relationships. In specific, to handle spatial-temporal dynamics under the long-range and global context, we design a graph-structured message passing architecture with the integration of the hypergraph learning paradigm. To capture category-wise crime heterogeneous relations in a dynamic environment, we introduce a multi-channel routing mechanism to learn the time-evolving structural dependency across crime types. We conduct extensive experiments on two real-world datasets, showing that our proposed ST-SHN framework can significantly improve the prediction performance as compared to various state-of-the-art baselines. The source code is available at: https://github.com/akaxlh/ST-SHN.