Quentin Rouxel

CUHK

CaFe-TeleVision: A Coarse-to-Fine Teleoperation System with Immersive Situated Visualization for Enhanced Ergonomics

Dec 17, 2025

Zixin Tang, Yiming Chen, Quentin Rouxel, Dianxi Li, Shuang Wu, Fei Chen

Abstract:Teleoperation presents a promising paradigm for remote control and robot proprioceptive data collection. Despite recent progress, current teleoperation systems still suffer from limitations in efficiency and ergonomics, particularly in challenging scenarios. In this paper, we propose CaFe-TeleVision, a coarse-to-fine teleoperation system with immersive situated visualization for enhanced ergonomics. At its core, a coarse-to-fine control mechanism is proposed in the retargeting module to bridge workspace disparities, jointly optimizing efficiency and physical ergonomics. To stream immersive feedback with adequate visual cues for human vision systems, an on-demand situated visualization technique is integrated in the perception module, which reduces the cognitive load for multi-view processing. The system is built on a humanoid collaborative robot and validated with six challenging bimanual manipulation tasks. A user study among 24 participants confirms that CaFe-TeleVision enhances ergonomics with statistical significance, indicating a lower task load and a higher user acceptance during teleoperation. Quantitative results also validate the superior performance of our system across six tasks, surpassing comparative methods by up to 28.89% in success rate and accelerating by 26.81% in completion time. Project webpage: https://clover-cuhk.github.io/cafe_television/

* Project webpage: https://clover-cuhk.github.io/cafe_television/; Code: https://github.com/Zixin-Tang/CaFe-TeleVision

Towards Deploying VLA without Fine-Tuning: Plug-and-Play Inference-Time VLA Policy Steering via Embodied Evolutionary Diffusion

Nov 18, 2025

Zhuo Li, Junjia Liu, Zhipeng Dong, Tao Teng, Quentin Rouxel, Darwin Caldwell, Fei Chen

Abstract:Vision-Language-Action (VLA) models have demonstrated significant potential in real-world robotic manipulation. However, pre-trained VLA policies still suffer from substantial performance degradation during downstream deployment. Although fine-tuning can mitigate this issue, its reliance on costly demonstration collection and intensive computation makes it impractical in real-world settings. In this work, we introduce VLA-Pilot, a plug-and-play inference-time policy steering method for zero-shot deployment of pre-trained VLA without any additional fine-tuning or data collection. We evaluate VLA-Pilot on six real-world downstream manipulation tasks across two distinct robotic embodiments, encompassing both in-distribution and out-of-distribution scenarios. Experimental results demonstrate that VLA-Pilot substantially boosts the success rates of off-the-shelf pre-trained VLA policies, enabling robust zero-shot generalization to diverse tasks and embodiments. Experimental videos and code are available at: https://rip4kobe.github.io/vla-pilot/.

* 9 pages, 8 figures, submitted to IEEE RA-L

Extremum Flow Matching for Offline Goal Conditioned Reinforcement Learning

May 26, 2025

Quentin Rouxel, Clemente Donoso, Fei Chen, Serena Ivaldi, Jean-Baptiste Mouret

Abstract:Imitation learning is a promising approach for enabling generalist capabilities in humanoid robots, but its scaling is fundamentally constrained by the scarcity of high-quality expert demonstrations. This limitation can be mitigated by leveraging suboptimal, open-ended play data, often easier to collect and offering greater diversity. This work builds upon recent advances in generative modeling, specifically Flow Matching, an alternative to Diffusion models. We introduce a method for estimating the extremum of the learned distribution by leveraging the unique properties of Flow Matching, namely, deterministic transport and support for arbitrary source distributions. We apply this method to develop several goal-conditioned imitation and reinforcement learning algorithms based on Flow Matching, where policies are conditioned on both current and goal observations. We explore and compare different architectural configurations by combining core components, such as critic, planner, actor, or world model, in various ways. We evaluated our agents on the OGBench benchmark and analyzed how different demonstration behaviors during data collection affect performance in a 2D non-prehensile pushing task. Furthermore, we validated our approach on real hardware by deploying it on the Talos humanoid robot to perform complex manipulation tasks based on high-dimensional image observations, featuring a sequence of pick-and-place and articulated object manipulation in a realistic kitchen environment. Experimental videos and code are available at: https://hucebot.github.io/extremum_flow_matching_website/
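The two Flow Matching properties named in this abstract, deterministic transport and an arbitrary source distribution, can be illustrated with a toy 1D example. This is a minimal sketch and not the authors' implementation: it assumes the common straight-line probability path, and every function name below (`interpolate`, `target_velocity`, `point_target_field`, `euler_transport`) is hypothetical. The closed-form velocity field stands in for a trained network and transports any source sample to a single target value `mu`.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight-line path from source x0 to target x1 at time t."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Regression target for the velocity along the straight-line path."""
    return x1 - x0


def point_target_field(mu):
    """Closed-form velocity field when every target sample equals mu.

    Assumption for illustration only: in practice this field is a learned model.
    """
    return lambda x, t: (mu - x) / (1.0 - t)


def euler_transport(x0, velocity, steps=50):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with explicit Euler steps.

    No noise is injected, so the same x0 always yields the same result.
    """
    x = x0
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity(x, k * dt)
    return x


if __name__ == "__main__":
    field = point_target_field(mu=2.0)
    # The source can be any distribution: here a uniform and a Gaussian sample.
    for x0 in (random.uniform(-5.0, 5.0), random.gauss(0.0, 1.0)):
        print(f"source {x0:+.3f} -> transported {euler_transport(x0, field):+.3f}")
```

Because the integration is an ordinary differential equation solve without added noise, repeating it from the same source sample gives the same output, which is the sense in which the transport is deterministic.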

From Vocal Instructions to Household Tasks: The Inria Tiago++ in the euROBIN Service Robots Coopetition

Dec 20, 2024

Fabio Amadio, Clemente Donoso, Dionis Totsila, Raphael Lorenzo, Quentin Rouxel, Olivier Rochel, Enrico Mingo Hoffman, Jean-Baptiste Mouret, Serena Ivaldi

Abstract:This paper describes the Inria team's integrated robotics system used in the 1st euROBIN coopetition, during which service robots performed voice-activated household tasks in a kitchen setting. The team developed a modified Tiago++ platform that leverages a whole-body control stack for autonomous and teleoperated modes, and an LLM-based pipeline for instruction understanding and task planning. The key contributions (open-sourced) are the integration of these components and the design of custom teleoperation devices, addressing practical challenges in the deployment of service robots.

Words2Contact: Identifying Support Contacts from Verbal Instructions Using Foundation Models

Jul 19, 2024

Dionis Totsila, Quentin Rouxel, Jean-Baptiste Mouret, Serena Ivaldi

Abstract:This paper presents Words2Contact, a language-guided multi-contact placement pipeline leveraging large language models and vision language models. Our method is a key component for language-assisted teleoperation and human-robot cooperation, where human operators can instruct the robots where to place their support contacts before whole-body reaching or manipulation using natural language. Words2Contact transforms the verbal instructions of a human operator into contact placement predictions; it also deals with iterative corrections, until the human is satisfied with the contact location identified in the robot's field of view. We benchmark state-of-the-art LLMs and VLMs for size and performance in contact prediction. We demonstrate the effectiveness of the iterative correction process, showing that users, even naive, quickly learn how to instruct the system to obtain accurate locations. Finally, we validate Words2Contact in real-world experiments with the Talos humanoid robot, instructed by human operators to place support contacts on different locations and surfaces to avoid falling when reaching for distant objects.

Flow Matching Imitation Learning for Multi-Support Manipulation

Jul 17, 2024

Quentin Rouxel, Andrea Ferrari, Serena Ivaldi, Jean-Baptiste Mouret

Abstract:Humanoid robots could benefit from using their upper bodies for support contacts, enhancing their workspace, stability, and ability to perform contact-rich and pushing tasks. In this paper, we propose a unified approach that combines an optimization-based multi-contact whole-body controller with Flow Matching, a recently introduced method capable of generating multi-modal trajectory distributions for imitation learning. In simulation, we show that Flow Matching is more appropriate for robotics than Diffusion and traditional behavior cloning. On a real full-size humanoid robot (Talos), we demonstrate that our approach can learn a whole-body non-prehensile box-pushing task and that the robot can close dishwasher drawers by adding contacts with its free hand when needed for balance. We also introduce a shared autonomy mode for assisted teleoperation, providing automatic contact placement for tasks not covered in the demonstrations. Full experimental videos are available at: https://hucebot.github.io/flow_multisupport_website/

Multi-Contact Whole Body Force Control for Position-Controlled Robots

Jan 16, 2024

Quentin Rouxel, Serena Ivaldi, Jean-Baptiste Mouret

Abstract:Many humanoid and multi-legged robots are controlled in positions rather than in torques, preventing direct control of contact forces, and hampering their ability to create multiple contacts to enhance their balance, such as placing a hand on a wall or a handrail. This paper introduces the SEIKO (Sequential Equilibrium Inverse Kinematic Optimization) pipeline, drawing inspiration from flexibility models used in series elastic actuators to indirectly control contact forces on traditional position-controlled robots. SEIKO formulates whole-body retargeting from Cartesian commands and admittance control using two quadratic programs solved in real time. We validated our pipeline with experiments on the real, full-scale humanoid robot Talos in various multicontact scenarios, including pushing tasks, far-reaching tasks, stair climbing, and stepping on sloped surfaces. This work opens the possibility of stable, contact-rich behaviors while getting around many of the challenges of torque-controlled robots. Code and videos are available at https://hucebot.github.io/seiko_controller_website/

Feasibility Retargeting for Multi-contact Teleoperation and Physical Interaction

Aug 07, 2023

Quentin Rouxel, Ruoshi Wen, Zhibin Li, Carlo Tiseo, Jean-Baptiste Mouret, Serena Ivaldi

Abstract:This short paper outlines two recent works on multi-contact teleoperation and the development of the SEIKO (Sequential Equilibrium Inverse Kinematic Optimization) framework. SEIKO adapts commands from the operator in real-time and ensures that the reference configuration sent to the underlying controller is feasible. Additionally, an admittance scheme is used to implement physical interaction, which is then combined with the operator's command and retargeted. SEIKO has been applied in simulations on various robots, including humanoid and quadruped robots designed for loco-manipulation. Furthermore, SEIKO has been tested on real hardware for bimanual heavy object carrying tasks.

* 2nd Workshop Toward Robot Avatars, 2023 IEEE International Conference on Robotics and Automation (ICRA), Jun 2023, London, United Kingdom

Achieving Dexterous Bidirectional Interaction in Uncertain Conditions for Medical Robotics

Jun 20, 2022

Carlo Tiseo, Quentin Rouxel, Martin Asenov, Keyhan Kouhkiloui Babarahmati, Subramanian Ramamoorthy, Zhibin Li, Michael Mistry

Abstract:Medical robotics can help improve and extend the reach of healthcare services. A major challenge for medical robots is the complex physical interaction between the robot and the patients, which is required to be safe. This work presents the preliminary evaluation of a recently introduced control architecture based on the Fractal Impedance Control (FIC) in medical applications. The deployed FIC architecture is robust to delay between the master and the replica robots. It can switch online between an admittance and impedance behaviour, and it is robust to interaction with unstructured environments. Our experiments analyse three scenarios: teleoperated surgery, rehabilitation, and remote ultrasound scan. The experiments did not require any adjustment of the robot tuning, which is essential in medical applications where the operators do not have the engineering background required to tune the controller. Our results show that it is possible to teleoperate the robot to cut using a scalpel, do an ultrasound scan, and perform remote occupational therapy. However, our experiments also highlighted the need for a better robot embodiment to precisely control the system in 3D dynamic tasks.

* video: https://youtu.be/G5NfFbh_ULg

Multi-Contact Motion Retargeting using Whole-body Optimization of Full Kinematics and Sequential Force Equilibrium

Jun 01, 2022

Quentin Rouxel, Kai Yuan, Ruoshi Wen, Zhibin Li

Abstract:This paper presents a multi-contact motion adaptation framework that enables teleoperation of high degree-of-freedom (DoF) robots, such as quadrupeds and humanoids, for loco-manipulation tasks in multi-contact settings. Our proposed algorithms optimize whole-body configurations and formulate the retargeting of multi-contact motions as sequential quadratic programming, which is robust and stable near the edges of feasibility constraints. Our framework allows real-time operation of the robot and reduces cognitive load for the operator because infeasible commands are automatically adapted into physically stable and viable motions on the robot. The results in simulations with full dynamics demonstrated the effectiveness of teleoperating different legged robots interactively and generating rich multi-contact movements. We evaluated the computational efficiency of the proposed algorithms, and further validated and analyzed multi-contact loco-manipulation tasks on humanoid and quadruped robots by reaching, active pushing and various traversal on uneven terrains.

* IEEE/ASME Transactions on Mechatronics, 2022
