In this work, we propose a trajectory optimization approach for robot navigation in cluttered 3D environments. We represent the robot's geometry as a semialgebraic set defined by polynomial inequalities, so that robots with general shapes can be suitably characterized. To address the robot navigation task in obstacle-dense environments, we exploit the free space directly to construct a sequence of free regions, and allocate each waypoint on the trajectory to a specific region. Then, we incorporate a uniform scaling factor for each free region, and formulate a Sums-of-Squares (SOS) optimization problem that renders the containment relationship between the robot and the free space computationally tractable. The SOS optimization problem is further reformulated as a semidefinite program (SDP), and the collision-free constraints are shown to be equivalent to limiting the scaling factor along the entire trajectory. In this context, the robot at a specific configuration is constrained to stay within the free region. Next, to solve the trajectory optimization problem with the proposed safety constraints (which are implicitly dependent on the robot configurations), we derive the analytical solution to the gradient of the minimum scaling factor with respect to the robot configuration. As a result, gradient-based methods can be used to solve the trajectory optimization problem efficiently. Through a series of simulations and real-world experiments, the proposed trajectory optimization approach is validated in various challenging scenarios, and the results demonstrate its effectiveness in generating collision-free trajectories in dense and intricate environments populated with obstacles.
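The idea of a minimum scaling factor and its gradient can be illustrated with a much simpler setting than the SOS/SDP formulation in the abstract. The sketch below assumes the robot is a finite set of body points that only translates and that the free region is a convex polytope {x : A(x - c) <= b} with b > 0; these simplifications, and the function name `min_scaling_factor`, are illustrative assumptions and not the authors' method. Scaling the region about its center c by a factor alpha gives {x : A(x - c) <= alpha * b}, so the robot is contained in the unscaled region exactly when the minimum alpha is at most 1.

```python
def min_scaling_factor(A, b, center, body_points, translation):
    """Smallest alpha such that every translated body point p + t satisfies
    A (p + t - center) <= alpha * b, together with a (sub)gradient of alpha
    with respect to the translation t.

    Assumption (not from the abstract): point-set robot, polytopic region,
    all entries of b strictly positive.
    """
    best_alpha = float("-inf")
    best_grad = None
    for a_i, b_i in zip(A, b):
        for p in body_points:
            # Ratio by which face i must be scaled to contain this point.
            value = sum(
                a_k * (p_k + t_k - c_k)
                for a_k, p_k, t_k, c_k in zip(a_i, p, translation, center)
            ) / b_i
            if value > best_alpha:
                best_alpha = value
                # d(value)/dt for the active face/point pair is a_i / b_i.
                best_grad = [a_k / b_i for a_k in a_i]
    return best_alpha, best_grad


# Unit box centered at the origin and a small square robot (hypothetical data).
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 1.0, 1.0, 1.0]
square = [(0.25, 0.25), (0.25, -0.25), (-0.25, 0.25), (-0.25, -0.25)]

alpha, grad = min_scaling_factor(A, b, (0.0, 0.0), square, (0.5, 0.0))
print(alpha, grad)  # 0.75 [1.0, 0.0]: contained, since alpha <= 1
```

Because alpha is a maximum over affine functions of the translation, the returned vector is an exact gradient wherever the maximizing face and point are unique, and a subgradient at ties. A gradient-based optimizer can then treat alpha <= 1 as the safety constraint.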
Multimodal Large Language Models (MLLMs) have shown impressive reasoning abilities and general intelligence in various domains. This has inspired researchers to train end-to-end MLLMs or to use large models to generate policies for embodied agents from human-selected prompts. However, these methods exhibit limited generalization to unseen tasks or scenarios, and overlook the multimodal environment information that is critical for robots to make decisions. In this paper, we introduce a novel Robotic Multimodal Perception-Planning (RoboMP$^2$) framework for robotic manipulation, which consists of a Goal-Conditioned Multimodal Preceptor (GCMP) and a Retrieval-Augmented Multimodal Planner (RAMP). Specifically, GCMP captures environment states by employing an MLLM tailored for embodied agents, with the abilities of semantic reasoning and localization. RAMP utilizes a coarse-to-fine retrieval method to find the $k$ most relevant policies as in-context demonstrations to enhance the planner. Extensive experiments demonstrate the superiority of RoboMP$^2$ on both the VIMA benchmark and real-world tasks, with around 10% improvement over the baselines.
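A generic coarse-to-fine retrieval of the $k$ most relevant entries can be sketched as follows. The abstract does not specify how either stage is implemented, so everything here is an assumption for illustration: the coarse stage ranks stored policies by tag overlap with the query, the fine stage re-ranks the survivors by cosine similarity of embedding vectors, and the names `retrieve_top_k`, `tags`, and `vec` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_top_k(query_tags, query_vec, library, k, coarse_size):
    """Two-stage retrieval over a list of policy records.

    Coarse stage (assumed): keep the `coarse_size` records sharing the most
    tags with the query. Fine stage (assumed): rank those by cosine
    similarity to the query embedding and return the names of the top k.
    """
    coarse = sorted(
        library, key=lambda e: len(query_tags & e["tags"]), reverse=True
    )[:coarse_size]
    fine = sorted(coarse, key=lambda e: cosine(query_vec, e["vec"]), reverse=True)
    return [e["name"] for e in fine[:k]]


# Hypothetical policy library with hand-made tags and 2-D embeddings.
library = [
    {"name": "pick_place", "tags": {"pick", "place"}, "vec": [1.0, 0.0]},
    {"name": "stack", "tags": {"pick", "stack"}, "vec": [0.6, 0.8]},
    {"name": "rotate", "tags": {"rotate"}, "vec": [1.0, 0.0]},
    {"name": "sweep", "tags": {"sweep", "place"}, "vec": [0.0, 1.0]},
]
demos = retrieve_top_k({"pick", "place"}, [1.0, 0.1], library, k=2, coarse_size=3)
print(demos)  # ['pick_place', 'stack']
```

In this toy library, "rotate" has a high embedding similarity to the query but is dropped by the coarse stage because it shares no tags, which is the kind of pruning a cheap first pass is meant to provide before the more expensive ranking.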
Abrupt maneuvers by surrounding vehicles (SVs) can typically lead to safety concerns and affect the task efficiency of the ego vehicle (EV), especially with model uncertainties stemming from environmental disturbances. This paper presents a real-time fail-operational controller that ensures the asymptotic convergence of an uncertain EV to a safe state, while preserving task efficiency in dynamic environments. An incremental Bayesian learning approach is developed to facilitate online learning and inference of changing environmental disturbances. Leveraging disturbance quantification and constraint transformation, we develop a stochastic fail-operational barrier based on the control barrier function (CBF). With this development, the uncertain EV is able to converge asymptotically from an unsafe state to a defined safe state with probabilistic stability. Subsequently, the stochastic fail-operational barrier is integrated into an efficient fail-operational controller based on quadratic programming (QP). This controller is tailored for the EV operating under control constraints in the presence of environmental disturbances, with both safety and efficiency objectives taken into consideration. We validate the proposed framework in connected cruise control (CCC) tasks, where SVs perform aggressive driving maneuvers. The simulation results demonstrate that our method empowers the EV to swiftly return to a safe state while upholding task efficiency in real time, even under time-varying environmental disturbances.
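For readers unfamiliar with CBF-based QP controllers, the following minimal sketch shows the deterministic textbook version of the idea, not the stochastic fail-operational barrier described in the abstract. It assumes a single affine safety constraint a·u + c >= 0, for which the QP min ||u - u_nom||^2 has a closed-form solution (projection onto a half-space). The car-following model (gap dynamics gap_dot = v_lead - u, barrier h = gap - d_safe, gain gamma) and all names are illustrative assumptions.

```python
def cbf_qp_filter(u_nom, a, c):
    """Closed-form solution of: min ||u - u_nom||^2  s.t.  a . u + c >= 0.

    With one affine constraint the QP reduces to projecting u_nom onto a
    half-space, so no numerical solver is needed in this sketch.
    """
    slack = sum(a_i * u_i for a_i, u_i in zip(a, u_nom)) + c
    if slack >= 0.0:
        return list(u_nom)  # nominal input already satisfies the constraint
    norm_sq = sum(a_i * a_i for a_i in a)
    if norm_sq == 0.0:
        raise ValueError("constraint cannot be satisfied by any input")
    step = -slack / norm_sq
    return [u_i + step * a_i for u_i, a_i in zip(u_nom, a)]


def safe_speed_command(u_nom, gap, d_safe, v_lead, gamma):
    """Car-following example (assumed model): h = gap - d_safe and
    h_dot = v_lead - u, so h_dot + gamma * h >= 0 becomes
    -u + (v_lead + gamma * h) >= 0."""
    h = gap - d_safe
    return cbf_qp_filter([u_nom], [-1.0], v_lead + gamma * h)[0]


# The ego starts inside the unsafe set (gap below d_safe); the filter slows it
# below the lead speed so that the gap grows back toward the safe distance.
print(safe_speed_command(u_nom=22.0, gap=8.0, d_safe=10.0, v_lead=20.0, gamma=0.5))  # 19.0
```

Starting from h < 0, the condition h_dot >= -gamma * h forces h to increase, which is the deterministic counterpart of converging from an unsafe state to a safe one. Control bounds and disturbance estimates, which the abstract's controller accounts for, are omitted here.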
Enforcing safety while preventing overly conservative behaviors is essential for autonomous vehicles to achieve high task performance. In this paper, we propose a barrier-enhanced homotopic parallel trajectory optimization (BHPTO) approach with over-relaxed alternating direction method of multipliers (ADMM) for real-time integrated decision-making and planning. To facilitate safe interactions between the ego vehicle (EV) and surrounding vehicles, a spatiotemporal safety module exhibiting bi-convexity is developed on the basis of barrier functions. Varying barrier coefficients are adopted for different time steps in a planning horizon to account for the motion uncertainties of surrounding HVs and mitigate conservative behaviors. Additionally, we exploit the discrete characteristics of driving maneuvers to initialize nominal behavior-oriented free-end homotopic trajectories based on reachability analysis, and each trajectory is locally constrained to a specific driving maneuver while sharing the same task objectives. By leveraging the bi-convexity of the safety module and the kinematics of the EV, we formulate the BHPTO as a bi-convex optimization problem. Then, constraint transcription and over-relaxed ADMM are employed to streamline the optimization process, such that multiple trajectories are generated in real time with feasibility guarantees. Through a series of experiments, the proposed approach demonstrates improved task accuracy, stability, and consistency in various traffic scenarios using synthetic and real-world traffic datasets.
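The over-relaxation step in ADMM can be shown on a toy convex problem; this is not the bi-convex BHPTO formulation. The sketch assumes the problem min 0.5 * ||x - a||^2 subject to lo <= x <= hi, split as x = z with the box constraint placed on z, and uses the standard scaled-form updates in which x is replaced by alpha * x + (1 - alpha) * z_old in the z- and dual updates. The function name and the parameter values (rho = 1, alpha = 1.6) are illustrative choices.

```python
def over_relaxed_admm_box(a, lo, hi, rho=1.0, alpha=1.6, iterations=100):
    """Over-relaxed ADMM (scaled form) for
        min 0.5 * ||x - a||^2   s.t.   x = z,  lo <= z <= hi.
    alpha = 1 recovers plain ADMM; 1 < alpha < 2 is over-relaxation.
    """
    n = len(a)
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iterations):
        # x-update: minimizer of 0.5*(x - a)^2 + (rho/2)*(x - z + u)^2.
        x = [(a_i + rho * (z_i - u_i)) / (1.0 + rho) for a_i, z_i, u_i in zip(a, z, u)]
        # Over-relaxation: blend the new x with the previous z.
        x_hat = [alpha * x_i + (1.0 - alpha) * z_i for x_i, z_i in zip(x, z)]
        # z-update: projection onto the box [lo, hi].
        z = [min(max(xh_i + u_i, lo), hi) for xh_i, u_i in zip(x_hat, u)]
        # Dual update uses the relaxed iterate as well.
        u = [u_i + xh_i - z_i for u_i, xh_i, z_i in zip(u, x_hat, z)]
    return z


# The exact solution is the elementwise clipping of `a` to [0, 1].
print(over_relaxed_admm_box([3.0, -2.0, 0.5], lo=0.0, hi=1.0))
```

On this example the iterates approach [1.0, 0.0, 0.5], which equals clipping the target to the box. The split into an unconstrained quadratic step and a cheap projection step is what makes each ADMM iteration inexpensive.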
As automation technologies advance, the need for compact and multi-modal sensors in robotic applications is growing. To address this demand, we introduce CompdVision, a novel sensor that combines near-field 3D visual and tactile sensing. This sensor, with dimensions of 22$\times$14$\times$14 mm, leverages a compound-eye imaging system to achieve a compact form factor without compromising its dual modalities. CompdVision utilizes two types of vision units to meet diverse sensing requirements. Stereo units with far-focus lenses can see through the transparent elastomer, facilitating depth estimation beyond the contact surface, while tactile units with near-focus lenses track the movement of markers embedded in the elastomer to obtain contact deformation. Experimental results validate the sensor's performance in 3D visual and tactile sensing. The sensor demonstrates effective depth estimation within a 70 mm range from its surface. Additionally, it achieves high accuracy in tangential and normal force measurements. The dual modalities and compact design make the sensor a versatile tool for complex robotic tasks.
Multi-robot systems are favored for object manipulation and transportation, especially for large objects. However, in more complex manipulation such as flipping, these systems encounter a new challenge: configuration disconnectivity of the manipulators. Grasping an object with multiple manipulators imposes closed-chain constraints on the system, which in turn limit the feasible motions of the manipulators and further compromise configuration connectivity. Multiple mobile manipulator systems offer much more flexibility in object manipulation thanks to the mobility of the mobile platform, and have the potential to address this problem. In this paper, a novel planning framework is proposed for complex flipping manipulation by incorporating platform motions and regrasping. Firstly, two types of trajectories, mobile manipulator planning and regrasping planning, are classified and can be assigned different priorities for different tasks. Secondly, corresponding planning methods are designed for each type of trajectory. Specifically, in mobile manipulator planning, the configuration of the platform is determined through optimization to ensure connectivity when the manipulator approaches configuration boundaries. In regrasping planning, closed-chain constraints are temporarily disregarded and the manipulation capabilities are prioritized to facilitate subsequent planning. Finally, the structure of the overall planning framework is provided. Experimental results demonstrate that the proposed planner efficiently plans the motions of the system to accomplish flipping manipulation. Additionally, a comprehensive experiment highlights the significance of our planner in extending the capabilities of multiple mobile manipulator systems in complex tasks.
Origami offers a promising alternative for designing innovative soft robotic actuators. Features of origami such as bi-directional motion and structural anisotropy have not been extensively explored in the past. This letter presents a novel design, inspired by origami tubes, for a bi-directional actuator. This actuator is capable of moving in two orthogonal directions and has separate channels throughout its body to control each movement. We introduce a bottom-up design methodology that can also be adapted for other complex movements. The actuator was manufactured using popular 3D printing techniques. To enhance its durability, we experimented with different 3D printing technologies and materials. The actuator's strength was further improved using silicone spin coating, and we compared the performance of coated, uncoated, and silicone-only specimens. The material model was empirically derived by testing specimens on a universal testing machine (UTM). Lastly, we suggest potential applications for these actuators, such as in quadruped robots.
Multi-modal behaviors exhibited by surrounding vehicles (SVs) can typically lead to traffic congestion and reduce the travel efficiency of autonomous vehicles (AVs) in dense traffic. This paper proposes a real-time parallel trajectory optimization method for the AV to achieve high travel efficiency in dynamic and congested environments. A spatiotemporal safety module is developed to facilitate the safe interaction between the AV and SVs in the presence of trajectory prediction errors resulting from the multi-modal behaviors of the SVs. By leveraging multiple shooting and constraint transcription, we transform the trajectory optimization problem into a nonlinear programming problem, which allows for the use of optimization solvers and parallel computing techniques to generate multiple feasible trajectories in parallel. Subsequently, these spatiotemporal trajectories are fed into a multi-objective evaluation module considering both safety and efficiency objectives, such that the optimal feasible trajectory corresponding to the optimal target lane can be selected. The proposed framework is validated through simulations in a dense and congested driving scenario with multiple uncertain SVs. The results demonstrate that our method enables the AV to safely navigate through a dense and congested traffic scenario while achieving high travel efficiency and task accuracy in real time.
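The selection step, in which candidate trajectories are scored on safety and efficiency and the best feasible one is chosen, might look roughly like the sketch below. The abstract does not give the cost terms, so the feasibility test (a minimum gap to other vehicles), the weighted cost, the field names, and all numbers are hypothetical.

```python
def select_trajectory(candidates, target_speed, min_safe_gap=2.0,
                      w_safety=1.0, w_efficiency=0.1):
    """Pick the lowest-cost feasible candidate, or None if none is feasible.

    Each candidate is a dict with (assumed) summary fields:
      'lane'      target lane index
      'min_gap'   smallest predicted gap to any surrounding vehicle [m]
      'avg_speed' average speed along the trajectory [m/s]
    """
    feasible = [c for c in candidates if c["min_gap"] >= min_safe_gap]
    if not feasible:
        return None

    def cost(c):
        safety_cost = w_safety / c["min_gap"]  # small gaps are penalized
        efficiency_cost = w_efficiency * abs(target_speed - c["avg_speed"])
        return safety_cost + efficiency_cost

    return min(feasible, key=cost)


# One candidate per target lane, as if produced by parallel optimizations.
candidates = [
    {"lane": 0, "min_gap": 1.5, "avg_speed": 14.0},  # too close: infeasible
    {"lane": 1, "min_gap": 4.0, "avg_speed": 10.0},
    {"lane": 2, "min_gap": 5.0, "avg_speed": 13.0},
]
best = select_trajectory(candidates, target_speed=15.0)
print(best["lane"])  # 2
```

Because each candidate is scored independently, the scoring can run as soon as each parallel optimization finishes, and only the final comparison needs all results.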
In dense traffic scenarios, ensuring safety while maintaining high task performance for autonomous driving is a critical challenge. To address this problem, this paper proposes a computationally efficient spatiotemporal receding horizon control (ST-RHC) scheme to generate a safe, dynamically feasible, energy-efficient trajectory in control space, where different driving tasks in dense traffic can be achieved with high accuracy and safety in real time. In particular, an embodied spatiotemporal safety barrier module considering proactive interactions is devised to mitigate the effects of inaccuracies resulting from the trajectory prediction of other vehicles. Subsequently, the motion planning and control problem is formulated as a constrained nonlinear optimization problem, which facilitates the effective use of off-the-shelf optimization solvers in conjunction with multiple shooting. The effectiveness of the proposed ST-RHC scheme is demonstrated through comprehensive comparisons with state-of-the-art algorithms on synthetic and real-world traffic datasets under dense traffic, and the results show superior performance in terms of accuracy, efficiency, and safety.
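In multiple shooting, the states at every node are decision variables alongside the controls, and the dynamics enter as equality ("defect") constraints between consecutive nodes. The sketch below computes those defects for a simple kinematic model with forward-Euler integration. The model (state x, y, heading, speed; control acceleration and yaw rate), the integrator, and the function names are assumptions chosen for brevity; the abstract does not state which vehicle model or discretization is used.

```python
import math


def step(state, control, dt):
    """One forward-Euler step of an assumed kinematic model.
    state = (x, y, heading, speed), control = (acceleration, yaw_rate)."""
    x, y, heading, speed = state
    accel, yaw_rate = control
    return (
        x + speed * math.cos(heading) * dt,
        y + speed * math.sin(heading) * dt,
        heading + yaw_rate * dt,
        speed + accel * dt,
    )


def shooting_defects(states, controls, dt):
    """Defects states[k+1] - step(states[k], controls[k]) for every interval.
    A nonlinear solver would drive all of these to zero as equality constraints."""
    if len(states) != len(controls) + 1:
        raise ValueError("need exactly one more state node than controls")
    defects = []
    for k, control in enumerate(controls):
        predicted = step(states[k], control, dt)
        defects.append(tuple(s - p for s, p in zip(states[k + 1], predicted)))
    return defects


# A dynamically consistent guess (built by rollout) has zero defects.
dt = 0.1
controls = [(1.0, 0.0), (0.5, 0.2), (0.0, 0.2)]
states = [(0.0, 0.0, 0.0, 5.0)]
for control in controls:
    states.append(step(states[-1], control, dt))
print(shooting_defects(states, controls, dt))
```

Unlike single shooting, the node states may be initialized inconsistently with the dynamics (for example from a reference path), and the solver closes the resulting non-zero defects during optimization.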