Lokesh Kumar

Gesture2Speech: How Far Can Hand Movements Shape Expressive Speech?

Mar 20, 2026

Lokesh Kumar, Nirmesh Shah, Ashishkumar P. Gudmalwar, Pankaj Wasnik

Abstract: Human communication seamlessly integrates speech and bodily motion, where hand gestures naturally complement vocal prosody to express intent, emotion, and emphasis. While recent text-to-speech (TTS) systems have begun incorporating multimodal cues such as facial expressions or lip movements, the role of hand gestures in shaping prosody remains largely underexplored. We propose a novel multimodal TTS framework, Gesture2Speech, that leverages visual gesture cues to modulate prosody in synthesized speech. Motivated by the observation that confident and expressive speakers coordinate gestures with vocal prosody, we introduce a multimodal Mixture-of-Experts (MoE) architecture that dynamically fuses linguistic content and gesture features within a dedicated style extraction module. The fused representation conditions an LLM-based speech decoder, enabling prosodic modulation that is temporally aligned with hand movements. We further design a gesture-speech alignment loss that explicitly models their temporal correspondence to ensure fine-grained synchrony between gestures and prosodic contours. Evaluations on the PATS dataset show that Gesture2Speech outperforms state-of-the-art baselines in both speech naturalness and gesture-speech synchrony. To the best of our knowledge, this is the first work to utilize hand gesture cues for prosody control in neural speech synthesis. Demo samples are available at https://research.sri-media-analysis.com/aaai26-beeu-gesture2speech/

* Accepted at The 2nd International Workshop on Bodily Expressed Emotion Understanding (BEEU) at AAAI 2026 [non-archival]

Enhancing Efficiency of Quadrupedal Locomotion over Challenging Terrains with Extensible Feet

May 03, 2023

Lokesh Kumar, Sarvesh Sortee, Titas Bera, Ranjan Dasgupta

Abstract: Recent advancements in legged locomotion research have made legged robots a preferred choice over their wheeled counterparts for navigating challenging terrains. This paper presents a novel locomotion policy, trained using Deep Reinforcement Learning, for a quadrupedal robot equipped with an additional prismatic joint between the knee and foot of each leg. Training is performed in the NVIDIA Isaac Gym simulation environment. Our study investigates the impact of these joints on maintaining the quadruped's desired height and following commanded velocities while traversing challenging terrains. We provide comparison results, based on a Cost of Transport (CoT) metric, between quadrupeds with and without prismatic joints. The learned policy is evaluated on a set of challenging terrains using the CoT metric in simulation. Our results demonstrate that the added degrees of actuation give the locomotion policy more flexibility to use the extra joints to traverse terrains that would be deemed infeasible or prohibitively expensive for the conventional quadrupedal design, resulting in significantly improved efficiency.

* Submitted to the 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC). 6 pages, 8 figures
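The abstract above compares designs with a Cost of Transport (CoT) metric but does not spell out the formula. A minimal sketch, assuming the common dimensionless definition (energy spent divided by weight times distance travelled) and assuming energy is approximated from absolute joint mechanical power, could look like this. The function names and the energy approximation are illustrative assumptions, not the authors' implementation.

```python
G = 9.81  # gravitational acceleration in m/s^2


def mechanical_energy(torques, velocities, dt):
    """Approximate energy in joules from per-step joint torques and velocities.

    Assumption: energy is the sum of |torque * joint velocity| * dt over all
    joints and timesteps. The paper may use a different energy model.
    """
    total_power = 0.0
    for step_torques, step_velocities in zip(torques, velocities):
        for tau, qdot in zip(step_torques, step_velocities):
            total_power += abs(tau * qdot)
    return total_power * dt


def cost_of_transport(energy_j, mass_kg, distance_m, g=G):
    """Dimensionless CoT = E / (m * g * d); lower means more efficient."""
    if mass_kg <= 0 or distance_m <= 0:
        raise ValueError("mass and distance must be positive")
    return energy_j / (mass_kg * g * distance_m)
```

With this definition, two robot designs can be compared on the same terrain by logging torques and joint velocities over a rollout, then dividing the resulting energy by the robot's weight and the distance covered.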

On the Collaborative Object Transportation Using Leader Follower Approach

May 02, 2023

Sumanta Ghosh, Subhajit Nath, Sarvesh Sortee, Lokesh Kumar, Titas Bera

Abstract: In this paper, we address the multi-agent collaborative object transportation problem in a partially known environment with obstacles under a specified goal condition. We propose a leader-follower approach for two mobile manipulators collaboratively transporting an object along specified desired trajectories. The proposed approach treats the mobile manipulation system as two independent subsystems, a mobile platform and a manipulator arm, and uses their kinematic models for trajectory tracking. We assume that the mobile platform is subject to non-holonomic constraints, with the manipulator carrying a rigid load. The desired trajectories of the end points of the load are obtained from a Probabilistic RoadMap-based planning approach. Our method combines a Proportional Navigation Guidance-based approach with a proposed Stop-and-Sync algorithm to reach sufficiently close to the desired trajectory; the deviation due to the non-holonomic constraints is compensated by the manipulator arm. A leader-follower approach for computing the inverse kinematics solution for the position of the end-effector of the manipulator arm is proposed to maintain the load rigidity. Further, we compare the proposed approach with other approaches to analyse the efficacy of our algorithm.

A Levy Flight based Narrow Passage Sampling Method for Probabilistic Roadmap Planners

Jul 02, 2021

Shubham Shukla, Lokesh Kumar, Titas Bera, Ranjan Dasgupta

Abstract: Sampling-based probabilistic roadmap planners (PRM) have been successful in motion planning of robots with higher degrees of freedom, but may fail to capture the connectivity of the configuration space in scenarios with a critical narrow passage. In this paper, we show a novel technique based on Levy Flights to generate key samples in the narrow regions of configuration space, which, when combined with a PRM, improves the completeness of the planner. Compared with pure random-walk-based methods, the technique substantially improves sample quality at the expense of minimal additional computation, while still outperforming the state-of-the-art random bridge building method in terms of number of collision calls, computational overhead, and sample quality. The method is robust to changes in the parameters related to the structure of the narrow passage, which adds to its generality. Several 2D and 3D motion planning simulations are presented, which show the effectiveness of the method.
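
The abstract does not give the sampling procedure itself, so the following is only a rough illustration of the general idea of a Levy flight: a random walk whose step lengths follow a heavy-tailed distribution, so that many short moves are mixed with occasional long jumps. This sketch assumes a 2D configuration space, Pareto-distributed step lengths, and a caller-supplied collision check; `levy_step`, `levy_walk_samples`, `is_free`, `alpha`, and `scale` are hypothetical names and parameters, not taken from the paper.

```python
import math
import random


def levy_step(rng, alpha=1.5, scale=0.05):
    """Draw a heavy-tailed step length (Pareto with minimum value `scale`)."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids division by zero
    return scale * u ** (-1.0 / alpha)


def levy_walk_samples(start, is_free, n_steps, rng, alpha=1.5, scale=0.05):
    """Walk from `start`, keeping only collision-free configurations.

    `is_free` is a hypothetical collision checker: it takes an (x, y) tuple
    and returns True when the configuration is collision-free.
    """
    x, y = start
    samples = []
    for _ in range(n_steps):
        r = levy_step(rng, alpha, scale)
        theta = rng.uniform(0.0, 2.0 * math.pi)  # uniformly random heading
        nx, ny = x + r * math.cos(theta), y + r * math.sin(theta)
        if is_free((nx, ny)):
            # Accept the move; rejected moves leave the walker in place.
            x, y = nx, ny
            samples.append((x, y))
    return samples
```

In a PRM setting, the accepted samples would be added as roadmap nodes near a seed point inside or close to the narrow passage. Each proposed move costs one collision check, which is the quantity the abstract uses when comparing methods.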
