Most reinforcement learning algorithms are inefficient at learning multiple tasks in complex robotic systems where different tasks share a set of actions. In such environments, a compound policy with shared neural network parameters can be learnt that performs multiple tasks concurrently. However, such a compound policy may become biased towards one task, or the gradients from different tasks may negate each other, making learning unstable and sometimes less data-efficient. In this paper, we propose a new approach for simultaneously training multiple tasks that share a set of common actions in continuous action spaces, which we call DiGrad (Differential Policy Gradient). The proposed framework is based on differential policy gradients and can accommodate multi-task learning in a single actor-critic network. We also propose a simple heuristic in the differential policy gradient update to further improve learning. The proposed architecture was tested on an 8-link planar manipulator and a 27 degrees of freedom (DoF) humanoid, learning multi-goal reachability tasks for 3 and 2 end effectors, respectively. We show that our approach supports efficient multi-task learning in complex robotic systems, outperforming related methods in continuous action spaces.
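To illustrate the gradient-conflict problem described above, the following minimal sketch shows how naively summing per-task gradients on shared parameters can largely cancel when two tasks pull in opposite directions. It is not the DiGrad update itself; the gradient values and function names are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; negative means the gradients conflict."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def summed_gradient(grads):
    """Naive multi-task update direction: element-wise sum of the task gradients."""
    return [sum(components) for components in zip(*grads)]


# Hypothetical gradients of two task objectives w.r.t. three shared parameters.
grad_task_a = [1.0, -2.0, 0.5]
grad_task_b = [-1.0, 2.0, 0.5]

combined = summed_gradient([grad_task_a, grad_task_b])
conflict = cosine_similarity(grad_task_a, grad_task_b)

print("combined gradient:", combined)
print("cosine similarity:", round(conflict, 3))
```

In this toy case the first two components of the summed gradient vanish, so the shared parameters they correspond to receive no learning signal even though each task individually has a strong gradient there.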
Real-time calculation of inverse kinematics (IK) with dynamically stable configurations is essential for humanoid robots, as they are highly susceptible to losing balance. This paper proposes a methodology for generating joint-space trajectories of stable configurations to solve inverse kinematics using deep reinforcement learning (RL). Our approach is based on exploring the entire configuration space of the robot and learning the best possible solutions using Deep Deterministic Policy Gradient (DDPG). The proposed strategy was evaluated on the highly articulated upper body of a humanoid model with 27 degrees of freedom (DoF). The trained model was able to solve inverse kinematics for the end effectors with 90% accuracy while maintaining balance in the double support phase.
This paper discusses the design of a novel compliant in-pipe climbing modular robot for small-diameter pipes. The robot consists of a kinematic chain of 3 OmniCrawler modules, with a link connected between each pair of adjacent modules via compliant joints. While the tank-like crawler mechanism provides good traction on low-friction surfaces, its circular cross-section makes it holonomic. The holonomic motion helps the robot re-align itself to avoid obstacles during motion, as well as to overcome turns in a minimal-energy posture. Additionally, the modularity enables it to negotiate T-junctions without motion singularity. The compliance is realized using 4 torsion springs incorporated in the joints that connect the 3 modules with 2 links. For the desired pipe diameter (Ø 75 mm), the springs' stiffness values are obtained by formulating a constrained optimization problem, which has been simulated in ADAMS MSC and further validated on a real robot prototype. To negotiate smooth vertical bends and variations in the friction coefficient of pipes, the design was later modified by replacing the springs with series elastic actuators (SEA) at 2 of the 4 joints.
This paper presents a modular in-pipeline climbing robot with a novel compliant foldable OmniCrawler mechanism. The circular cross-section of the OmniCrawler module enables holonomic motion, which facilitates aligning the robot with the direction of bends. Additionally, the crawler mechanism provides a fair amount of traction, even on slippery surfaces. These advantages of crawler modules are further supplemented by incorporating active compliance in the module itself, which helps the robot negotiate sharp bends in small-diameter pipes. The robot has a series of 3 such compliant foldable modules interconnected by links via passive joints. For the desired pipe diameter and bend curvature, the spring stiffness value for each passive joint is determined by formulating a constrained optimization problem using the quasi-static model of the robot. Moreover, the minimum friction coefficient between the module and the pipe surface at which the robot can climb vertically without slipping is estimated. The numerical simulation results have been further validated by experiments on a real robot prototype.
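As a rough illustration of the no-slip estimate mentioned above, the sketch below assumes a simple Coulomb friction model in which the robot's full weight must be carried by friction at the module-pipe contacts during a vertical climb. This is a simplification and not the quasi-static model used in the paper; the mass, normal forces, and function names are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2


def min_friction_coefficient(mass_kg, normal_forces_n):
    """Smallest Coulomb friction coefficient that prevents slipping.

    Assumes friction at the contacts alone supports the weight:
    mu * sum(N_i) >= m * g, hence mu_min = m * g / sum(N_i).
    """
    total_normal = sum(normal_forces_n)
    if total_normal <= 0:
        raise ValueError("total normal force must be positive")
    return mass_kg * G / total_normal


def can_climb(mu, mass_kg, normal_forces_n):
    """True if the available friction coefficient mu is enough to avoid slipping."""
    return mu >= min_friction_coefficient(mass_kg, normal_forces_n)


# Hypothetical example: a 0.5 kg robot pressing on the pipe wall with 4 N per module.
mass = 0.5
normals = [4.0, 4.0, 4.0]

mu_min = min_friction_coefficient(mass, normals)
print("minimum friction coefficient:", round(mu_min, 3))
print("climbs with mu = 0.5:", can_climb(0.5, mass, normals))
print("climbs with mu = 0.3:", can_climb(0.3, mass, normals))
```

Under this assumption, increasing the normal forces (for example through stiffer joint springs) lowers the friction coefficient required to climb, which is one way to see why the stiffness and friction estimates are linked.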