



Abstract: Complex mechanical systems such as vehicle powertrains are inherently subject to multiple nonlinearities and uncertainties arising from parametric variations. Modeling and calibration errors are therefore unavoidable, making the transfer of control systems from simulation to real-world systems a critical challenge. Traditional robust control methods have limitations in handling certain types of nonlinearities and uncertainties, requiring a more practical approach capable of comprehensively compensating for these various constraints. This study proposes a new robust control approach using the framework of deep reinforcement learning (DRL). The key strategy lies in the synergy among domain randomization-based DRL, long short-term memory (LSTM)-based actor and critic networks, and model-based control (MBC). The problem setup is modeled via the latent Markov decision process (LMDP), a set of vanilla MDPs, for a controlled system subject to uncertainties and nonlinearities. In the LMDP, the dynamics of the environment simulator are randomized during training to improve the robustness of the control system in real testing environments. The randomization increases the training difficulty as well as the conservativeness of the resulting control system; therefore, training is assisted by the concurrent use of a model-based controller based on a nominal system model. Compared to traditional DRL-based controls, the proposed controller design achieves a high level of generalization with a more compact neural network architecture and a smaller amount of training data. The proposed approach is verified via practical application to active damping for a complex powertrain system with nonlinearities and parametric variations. Comparative tests demonstrate the high robustness of the proposed approach.
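
The abstract describes domain-randomized DRL training in which a model-based controller supplies a nominal control input that the learned policy corrects. The following is a minimal sketch of that training loop, assuming a generic episodic environment interface; the environment factory, agent, MBC object, parameter names, and randomization ranges are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

def sample_randomized_params(rng):
    """Draw plant parameters for one vanilla MDP in the latent MDP family.
    The parameter names and ranges here are illustrative assumptions."""
    return {
        "stiffness": rng.uniform(0.8, 1.2),   # relative to nominal
        "damping":   rng.uniform(0.7, 1.3),
        "backlash":  rng.uniform(0.0, 0.05),
    }

def train(env_factory, agent, mbc, episodes=1000, seed=0):
    """Domain-randomized training assisted by a model-based controller (MBC)."""
    rng = np.random.default_rng(seed)
    for ep in range(episodes):
        # Each episode runs on an environment instance with randomized dynamics.
        env = env_factory(sample_randomized_params(rng))
        obs, done = env.reset(), False
        hidden = agent.initial_state()            # LSTM hidden state of the actor
        while not done:
            # The DRL action corrects the nominal (model-based) control input,
            # which eases learning under randomized dynamics.
            u_mbc = mbc.control(obs)
            u_rl, hidden = agent.act(obs, hidden)
            obs_next, reward, done, _ = env.step(u_mbc + u_rl)
            agent.store(obs, u_rl, reward, obs_next, done)
            obs = obs_next
        agent.update()                            # e.g., an actor-critic update
```

In this sketch, the summation of the MBC and DRL outputs reflects the stated role of the nominal model-based controller in assisting training progress; the actual combination rule used in the paper may differ.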




Abstract: Inverse kinematics is an important and challenging problem in the operation of industrial manipulators. This study proposes a simple inverse kinematics calculation scheme for an industrial serial manipulator. The proposed technique can calculate appropriate values of the joint variables to realize the desired end-effector position and orientation while considering the motion costs of each joint. Two scalar functions are defined for the joint variables: one evaluates the end-effector position and orientation, whereas the other evaluates the motion efficiency of the joints. By combining the two scalar functions, the inverse kinematics calculation of the manipulator is formulated as a numerical optimization problem. Furthermore, a simple algorithm for solving the inverse kinematics via the aforementioned optimization is constructed on the basis of simultaneous perturbation stochastic approximation with a norm-limited update vector (NLSPSA). The proposed scheme considers not only the accuracy of the position and orientation of the end-effector but also the efficiency of the robot movement. Therefore, it yields a practical solution to the inverse problem. Moreover, the proposed algorithm is simple and easy to implement owing to the high computational efficiency of NLSPSA. Finally, the effectiveness of the proposed method is verified through numerical examples using a redundant manipulator.
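
The abstract formulates inverse kinematics as minimization of a combined scalar cost (pose error plus joint-motion cost) solved with SPSA whose update vector is norm-limited. Below is a minimal sketch of such an NLSPSA iteration, assuming a user-supplied forward-kinematics function; the cost weights, gain schedules, and norm bound are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def cost(q, target_pose, forward_kinematics, w_pose=1.0, w_motion=0.1, q_ref=None):
    """Combined scalar cost: end-effector pose error plus joint-motion effort."""
    pose_err = np.linalg.norm(forward_kinematics(q) - target_pose) ** 2
    q_ref = np.zeros_like(q) if q_ref is None else q_ref
    motion = np.linalg.norm(q - q_ref) ** 2       # penalizes large joint motion
    return w_pose * pose_err + w_motion * motion

def nlspsa_ik(q0, target_pose, forward_kinematics, iters=500,
              a=0.05, c=0.01, delta_max=0.05, seed=0):
    """Solve IK by SPSA with a norm-limited update vector (illustrative gains)."""
    rng = np.random.default_rng(seed)
    q = q0.copy()
    for k in range(iters):
        ck = c / (k + 1) ** 0.101                 # perturbation gain (standard SPSA schedule)
        ak = a / (k + 1) ** 0.602                 # step-size gain
        d = rng.choice([-1.0, 1.0], size=q.shape) # simultaneous random perturbation
        jp = cost(q + ck * d, target_pose, forward_kinematics)
        jm = cost(q - ck * d, target_pose, forward_kinematics)
        g = (jp - jm) / (2.0 * ck) * (1.0 / d)    # SPSA gradient estimate (two cost evaluations)
        step = -ak * g
        # Norm-limit the update vector so joint increments stay bounded per iteration.
        n = np.linalg.norm(step)
        if n > delta_max:
            step *= delta_max / n
        q = q + step
    return q
```

The two-evaluation gradient estimate is what keeps the per-iteration cost low regardless of the number of joints, which is consistent with the claimed suitability for redundant manipulators.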