Abstract: Utilizing functional elements in an industrial environment, such as displays and interactive valves, provides effective possibilities for robot training. When preparing simulations for robots or applications that involve high-level scene understanding, the simulation environment must be equally detailed. Although CAD files for such environments deliver an exact description of the geometry and visuals, they usually lack semantic, relational and functional information, thus limiting the simulation and training possibilities. A 3D scene graph can organize semantic, spatial and functional information by enriching the environment through a Large Vision-Language Model (LVLM). In this paper, we present an offline approach to creating detailed 3D scene graphs from CAD environments. This will serve as a foundation to include the relations of functional and actionable elements, which can then be used for dynamic simulation and reasoning. Key results of this research include quantitative results for the generated semantic labels as well as qualitative results for the scene graph, especially with regard to pipe structures and identified functional relations. All code, results and the environment will be made available at https://cad-scenegraph.github.io
Abstract: Bridging the sim-to-real gap remains a fundamental challenge in robotics, as accurate dynamic parameter estimation is essential for reliable model-based control, realistic simulation, and safe deployment of manipulators. Traditional analytical approaches often fall short when faced with complex robot structures and interactions. Data-driven methods offer a promising alternative, yet conventional neural networks such as recurrent models struggle to capture the long-range dependencies critical for accurate estimation. In this study, we propose a Transformer-based approach for dynamic parameter estimation, supported by an automated pipeline that generates diverse robot models and enriched trajectory data using Jacobian-derived features. The dataset consists of 8,192 robots with varied inertial and frictional properties. Leveraging attention mechanisms, our model effectively captures both temporal and spatial dependencies. Experimental results highlight the influence of sequence length, sampling rate, and architecture, with the best configuration (sequence length 64, 64 Hz, four layers, 32 heads) achieving a validation R² of 0.8633. Mass and inertia are estimated with near-perfect accuracy and Coulomb friction with moderate-to-high accuracy, while viscous friction and distal link center-of-mass remain more challenging. These results demonstrate that combining Transformers with automated dataset generation and kinematic enrichment enables scalable, accurate dynamic parameter estimation, contributing to improved sim-to-real transfer in robotic systems.
Abstract: There is a growing demand for teleoperated robots. This paper presents a novel method for reducing the vibration noise generated by the robot's own motion (ego-noise), which can degrade the quality of tactile feedback for teleoperated robots. Our approach focuses on perceived intensity, a measure of how strongly humans experience vibration, to create a noise filter that aligns with human perceptual characteristics. This system effectively subtracts ego-noise while preserving the essential tactile signals, ensuring more accurate and reliable haptic feedback for operators. This method offers a refined solution to the challenge of maintaining high-quality tactile feedback in teleoperated systems.
Abstract: This study addresses the challenge of low dexterity in teleoperation tasks caused by limited sensory feedback and visual occlusion. We propose a novel approach that integrates haptic feedback into teleoperation using the adaptive triggers of a commercially available DualSense controller. By adjusting button stiffness based on the proximity of objects to the robot's end effector, the system provides intuitive, real-time feedback to the operator. To achieve this, the effective volume of the end effector is virtually expanded, allowing the system to predict interactions by calculating overlap with nearby objects. This predictive capability is independent of the user's intent or the robot's speed, enhancing the operator's situational awareness without requiring complex pre-programmed behaviors. The stiffness of the adaptive triggers is adjusted in proportion to this overlapping volume, effectively conveying spatial proximity and movement cues through a "one degree of freedom" haptic feedback mechanism. Compared to existing solutions, this method reduces hardware requirements and computational complexity by using a geometric simplification approach, enabling efficient operation with minimal processing demands. Simulation results demonstrate that the proposed system reduces collision risk and improves user performance, offering an intuitive, precise, and safe teleoperation experience despite real-world uncertainties and communication delays.
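The overlap-to-stiffness mapping described in this abstract can be illustrated with a minimal sketch. It assumes that both the virtually expanded end effector and the nearby object are approximated by spheres; the abstract only mentions a "geometric simplification approach", so the choice of spheres, the function names, the `margin` parameter, and the normalization to a 0..1 stiffness value are all hypothetical and not taken from the paper. No controller API is used here; sending the value to a DualSense trigger is out of scope.

```python
import math


def sphere_volume(radius):
    """Volume of a sphere with the given radius."""
    return 4.0 / 3.0 * math.pi * radius ** 3


def sphere_overlap_volume(r_a, r_b, d):
    """Intersection volume of two spheres whose centers are d apart."""
    if d >= r_a + r_b:
        return 0.0  # spheres do not touch
    if d <= abs(r_a - r_b):
        return sphere_volume(min(r_a, r_b))  # smaller sphere fully inside
    # Standard lens-volume formula for two partially intersecting spheres.
    return (math.pi * (r_a + r_b - d) ** 2
            * (d * d + 2.0 * d * (r_a + r_b) - 3.0 * (r_a - r_b) ** 2)
            / (12.0 * d))


def trigger_stiffness(effector_center, effector_radius, margin,
                      obstacle_center, obstacle_radius):
    """Normalized stiffness in [0, 1], proportional to the overlap volume.

    `margin` (hypothetical) is how far the end effector's bounding sphere
    is virtually expanded beyond its physical radius.
    """
    expanded = effector_radius + margin
    d = math.dist(effector_center, obstacle_center)
    overlap = sphere_overlap_volume(expanded, obstacle_radius, d)
    # Largest possible overlap: the smaller sphere lies fully inside the other.
    max_overlap = sphere_volume(min(expanded, obstacle_radius))
    return overlap / max_overlap


if __name__ == "__main__":
    # Stiffness rises as the end effector approaches the obstacle.
    for x in (0.30, 0.20, 0.10, 0.05):
        s = trigger_stiffness((0.0, 0.0, 0.0), 0.05, 0.05,
                              (x, 0.0, 0.0), 0.10)
        print(f"distance {x:.2f} m -> stiffness {s:.3f}")
```

Because the mapping depends only on geometry, it yields a cue regardless of the robot's speed or the operator's intent, which matches the property the abstract claims. A real system would likely need more faithful shapes than single spheres.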