The diffusion-based Singing Voice Conversion (SVC) methods have achieved remarkable performances, producing natural audios with high similarity to the target timbre. However, the iterative sampling process results in slow inference speed, and acceleration thus becomes crucial. In this paper, we propose CoMoSVC, a consistency model-based SVC method, which aims to achieve both high-quality generation and high-speed sampling. A diffusion-based teacher model is first specially designed for SVC, and a student model is further distilled under self-consistency properties to achieve one-step sampling. Experiments on a single NVIDIA GTX4090 GPU reveal that although CoMoSVC has a significantly faster inference speed than the state-of-the-art (SOTA) diffusion-based SVC system, it still achieves comparable or superior conversion performance based on both subjective and objective metrics. Audio samples and codes are available at https://comosvc.github.io/.
In this paper, we introduce a new class of parameterized controllers, drawing inspiration from Model Predictive Control (MPC). The controller resembles a Quadratic Programming (QP) solver of a linear MPC problem, with the parameters of the controller being trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This approach addresses the limitations of common controllers with Multi-Layer Perceptron (MLP) or other general neural network architecture used in DRL, in terms of verifiability and performance guarantees, and the learned controllers possess verifiable properties like persistent feasibility and asymptotic stability akin to MPC. On the other hand, numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in terms of control performance and has superior robustness against modeling uncertainty and noises. Furthermore, the proposed controller is significantly more computationally efficient compared to MPC and requires fewer parameters to learn than MLP controllers. Real-world experiments on vehicle drift maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks.
Inertia drift is an aggressive transitional driving maneuver, which is challenging due to the high nonlinearity of the system and the stringent requirement on control and planning performance. This paper presents a solution for the consecutive inertia drift of an autonomous RC car based on primitive-based planning and data-driven control. The planner generates complex paths via the concatenation of path segments called primitives, and the controller eases the burden on feedback by interpolating between multiple real trajectories with different initial conditions into one near-feasible reference trajectory. The proposed strategy is capable of drifting through various paths containing consecutive turns, which is validated in both simulation and reality.
The Linear-Quadratic Regulation (LQR) problem with unknown system parameters has been widely studied, but it has remained unclear whether $\tilde{ \mathcal{O}}(\sqrt{T})$ regret, which is the best known dependence on time, can be achieved almost surely. In this paper, we propose an adaptive LQR controller with almost surely $\tilde{ \mathcal{O}}(\sqrt{T})$ regret upper bound. The controller features a circuit-breaking mechanism, which circumvents potential safety breach and guarantees the convergence of the system parameter estimate, but is shown to be triggered only finitely often and hence has negligible effect on the asymptotic performance of the controller. The proposed controller is also validated via simulation on Tennessee Eastman Process~(TEP), a commonly used industrial process example.
Drift control is significant to the safety of autonomous vehicles when there is a sudden loss of traction due to external conditions such as rain or snow. It is a challenging control problem due to the presence of significant sideslip and nearly full saturation of the tires. In this paper, we focus on the control of drift maneuvers following circular paths with either fixed or moving centers, subject to change in the tire-ground interaction, which are common training tasks for drift enthusiasts and can therefore be used as benchmarks of the performance of drift control. In order to achieve the above tasks, we propose a novel hierarchical control architecture which decouples the curvature and center control of the trajectory. In particular, an outer loop stabilizes the center by tuning the target curvature, and an inner loop tracks the curvature using a feedforward/feedback controller enhanced by an $\mathcal{L}_1$ adaptive component. The hierarchical architecture is flexible because the inner loop is task-agnostic and adaptive to changes in tire-road interaction, which allows the outer loop to be designed independent of low-level dynamics, opening up the possibility of incorporating sophisticated planning algorithms. We implement our control strategy on a simulation platform as well as on a 1/10 scale Radio-Control~(RC) car, and both the simulation and experiment results illustrate the effectiveness of our strategy in achieving the above described set of drift maneuvering tasks.
The control for aggressive driving of autonomous cars is challenging due to the presence of significant tyre slip. Data-driven and mechanism-based methods for the modeling and control of autonomous cars under aggressive driving conditions are limited in data efficiency and adaptability respectively. This paper is an attempt toward the fusion of the two classes of methods. By means of a modular design that is consisted of mechanism-based and data-driven components, and aware of the two-timescale phenomenon in the car model, our approach effectively improves over previous methods in terms of data efficiency, ability of transfer and final performance. The hybrid mechanism-and-data-driven approach is verified on TORCS (The Open Racing Car Simulator). Experiment results demonstrate the benefit of our approach over purely mechanism-based and purely data-driven methods.
This paper considers the data-driven linear-quadratic regulation (LQR) problem where the system parameters are unknown and need to be identified in real time. Contrary to existing system identification and data-driven control methods, which typically require either offline data collection or multiple resets, we propose an online non-episodic algorithm that gains knowledge about the system from a single trajectory. The algorithm guarantees that both the identification error and the suboptimality gap of control performance in this trajectory converge to zero almost surely. Furthermore, we characterize the almost sure convergence rates of identification and control, and reveal an optimal trade-off between exploration and exploitation. We provide a numerical example to illustrate the effectiveness of our proposed strategy.
Value-based methods of multi-agent reinforcement learning (MARL), especially the value decomposition methods, have been demonstrated on a range of challenging cooperative tasks. However, current methods pay little attention to the interaction between agents, which is essential to teamwork in games or real life. This limits the efficiency of value-based MARL algorithms in the two aspects: collaborative exploration and value function estimation. In this paper, we propose a novel cooperative MARL algorithm named as interactive actor-critic~(IAC), which models the interaction of agents from the perspectives of policy and value function. On the policy side, a multi-agent joint stochastic policy is introduced by adopting a collaborative exploration module, which is trained by maximizing the entropy-regularized expected return. On the value side, we use the shared attention mechanism to estimate the value function of each agent, which takes the impact of the teammates into consideration. At the implementation level, we extend the value decomposition methods to continuous control tasks and evaluate IAC on benchmark tasks including classic control and multi-agent particle environments. Experimental results indicate that our method outperforms the state-of-the-art approaches and achieves better performance in terms of cooperation.