Hybrid model predictive control (MPC) with both continuous and discrete variables is widely applicable to robotic control tasks, especially those involving contact with the environment. Because of its combinatorial complexity, hybrid MPC can be too slow to solve for real-time applications. In this paper, we propose a hybrid MPC solver based on Generalized Benders Decomposition (GBD). The algorithm enumerates and stores cutting planes online inside a finite buffer. After a short cold-start phase, the stored cuts provide warm-starts for new problem instances to improve the solving speed. The solving speed is maintained despite disturbances and a randomly changing environment. Leveraging the sparsity of feasibility cuts, we also propose a fast algorithm for Benders master problems. Our solver is validated by controlling a cart-pole system with randomly moving soft contact walls and a free-flying robot navigating around obstacles. The results show that, with significantly less data than previous works, the solver reaches speeds competitive with the off-the-shelf solver Gurobi despite the Python overhead.
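The idea of a finite cut buffer that warm-starts new problem instances can be sketched as follows. This is a minimal illustration under stated assumptions, not the solver described in the abstract: the affine cut form `z >= g @ delta + h @ theta + c`, the names `CutBuffer` and `warm_start`, the first-in-first-out eviction rule, and the brute-force enumeration used in place of a real master-problem solver are all hypothetical.

```python
from collections import deque
from itertools import product

import numpy as np


class CutBuffer:
    """Finite FIFO store of affine cuts  z >= g @ delta + h @ theta + c.

    delta is the binary vector, theta the problem parameter (for example
    the initial state). The cut form is an assumption of this sketch.
    """

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest cut once it is full
        self.cuts = deque(maxlen=capacity)

    def add(self, g, h, c):
        self.cuts.append((np.asarray(g, float), np.asarray(h, float), float(c)))

    def lower_bound(self, delta, theta):
        """Largest stored cut value at (delta, theta); -inf if no cuts yet."""
        if not self.cuts:
            return -np.inf
        return max(g @ delta + h @ theta + c for g, h, c in self.cuts)


def warm_start(buffer, theta, n_binary):
    """Return the binary vector with the smallest cut-implied lower bound.

    Enumeration stands in for a master-problem solver and is only
    sensible for a handful of binary variables.
    """
    candidates = [np.array(bits, float) for bits in product((0, 1), repeat=n_binary)]
    return min(candidates, key=lambda d: buffer.lower_bound(d, theta))
```

With an empty buffer every candidate has the same bound, so `warm_start` simply returns the all-zero vector, which loosely mirrors a cold start; as cuts accumulate, the returned candidate becomes more informative.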
Link prediction in biomedical knowledge graphs (KGs) aims to predict unknown interactions between entities, including drug-target interactions (DTIs) and drug-drug interactions (DDIs), which is critical for drug discovery and therapeutics. Previous methods tend to utilize the rich semantic relations and topological structure of the KG to predict missing links, yielding promising outcomes. However, all these works focus only on improving predictive performance without considering the inevitable noise and unreliable interactions in the KGs, which limits the development of KG-based computational methods. To address these limitations, we propose a Denoised Link Prediction framework, called DenoisedLP. DenoisedLP obtains reliable interactions based on the local subgraph by denoising noisy links in a learnable way, providing a universal module for mining underlying task-relevant relations. To work together with smoothed semantic information, DenoisedLP introduces a semantic subgraph by blurring conflicting relations around the predicted link. By maximizing the mutual information between the reliable structure and the smoothed semantic relations, DenoisedLP emphasizes the informative interactions for predicting relation-specific links. Experimental results on real-world datasets demonstrate that DenoisedLP outperforms state-of-the-art methods on DTI and DDI prediction tasks, and verify the effectiveness and robustness of denoising unreliable interactions on contaminated KGs.
This paper presents SCALER, a versatile free-climbing multi-limbed robot that is designed to achieve tightly coupled simultaneous locomotion and dexterous grasping. Although existing quadruped-limbed robots have shown impressive dexterous skills such as object manipulation, it is essential to balance power-intensive locomotion and dexterous grasping capabilities. We design a torso linkage and a parallel-serial limb to meet these conflicting requirements, which pose unique challenges in hardware design. SCALER employs underactuated two-fingered GOAT grippers that can mechanically adapt and offer 7 modes of grasping, enabling SCALER to traverse extreme terrains with multi-modal grasping strategies. We study the whole-body approach, where SCALER uses its body and limbs to generate additional forces for stable grasping of the environment, further enhancing versatility. Furthermore, we improve the GOAT gripper actuation speed to realize more dynamic climbing under closed-loop control. With these proposed technologies, SCALER can traverse vertical, overhanging, upside-down, and slippery terrains, as well as bouldering walls with non-convex-shaped climbing holds, under Earth's gravity.
Hybrid model predictive control (MPC) with both continuous and discrete variables is widely applicable to robotic control tasks, especially those involving contact with the environment. Because of its combinatorial complexity, hybrid MPC can be too slow to solve for real-time applications. In this paper, we propose a hybrid MPC solver based on Generalized Benders Decomposition (GBD) with continual learning. The algorithm accumulates cutting planes from the invariant dual space of the subproblems. After a short cold-start phase, the accumulated cuts provide warm-starts for new problem instances to increase the solving speed. The solving speed is maintained even in a randomly changing environment that the controller is not prepared for. We verify our solver by controlling a cart-pole system with randomly moving soft contact walls and show that it solves 2-3 times faster than the off-the-shelf solver Gurobi.
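As a rough illustration of why cuts drawn from an invariant dual space can be reused across problem instances, consider the linear-programming special case. This is an assumption made for simplicity, since the abstract does not state the subproblem form, and the symbols $c$, $A$, $B$, $b$, $\nu$, and $z$ are introduced only for this sketch. Here $\delta$ is the discrete variable fixed by the master problem and $b$ collects the instance data, such as the initial state.

```latex
% Subproblem for fixed \delta and its dual; the equality holds when the
% primal is feasible and bounded (LP strong duality).
\begin{aligned}
v(\delta;\, b) &= \min_{x \ge 0} \; c^\top x
                 \quad \text{s.t.} \quad A x = b - B\delta \\
               &= \max_{\nu} \; \nu^\top (b - B\delta)
                 \quad \text{s.t.} \quad A^\top \nu \le c .
\end{aligned}

% Any dual-feasible \nu_k gives a lower bound on v by weak duality,
% i.e. an optimality cut for the master problem:
z \;\ge\; \nu_k^\top (b - B\delta)
\qquad \text{for all } \delta \text{ and all } b .
```

The dual feasible set $\{\nu : A^\top \nu \le c\}$ depends on neither $\delta$ nor $b$, so under these assumptions a cut generated for one instance remains valid when $b$ changes, which is what allows accumulated cuts to warm-start later instances.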
Recent advances in artificial intelligence (AI), as well as in deep and graph learning models, have established their usefulness in biomedical applications, especially in predicting drug-drug interactions (DDIs). A DDI refers to a change in the effect of one drug due to the presence of another drug in the human body, and DDIs play an essential role in drug discovery and clinical research. Predicting DDIs through traditional clinical trials and experiments is expensive and time-consuming. To correctly apply advanced AI and deep learning, developers and users face various challenges, such as the availability and encoding of data resources and the design of computational methods. This review summarizes chemical structure-based, network-based, NLP-based, and hybrid methods, providing an updated and accessible guide for the broad research and development community with different domain knowledge. We introduce widely used molecular representations and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDI prediction.
Text-to-speech (TTS) has undergone remarkable improvements in performance, particularly with the advent of Denoising Diffusion Probabilistic Models (DDPMs). However, the perceived quality of audio depends not solely on its content, pitch, rhythm, and energy, but also on the physical environment. In this work, we propose ViT-TTS, the first visual TTS model with scalable diffusion transformers. ViT-TTS complements the phoneme sequence with visual information to generate audio with high perceived quality, opening up new avenues for practical applications of AR and VR that allow a more immersive and realistic audio experience. To mitigate the data scarcity in learning visual acoustic information, we 1) introduce a self-supervised learning framework to enhance both the visual-text encoder and the denoiser decoder; 2) leverage the diffusion transformer, which is scalable in terms of parameters and capacity, to learn visual scene information. Experimental results demonstrate that ViT-TTS achieves new state-of-the-art results, outperforming cascaded systems and other baselines regardless of the visibility of the scene. With low-resource data (1h, 2h, 5h), ViT-TTS achieves results comparable to those of rich-resource baselines. Audio samples are available at https://ViT-TTS.github.io/.
Mixed-integer convex and nonlinear programs (MICP and MINLP) are expressive but require long solving times. Recent work that combines data-driven methods with solver heuristics has shown potential to overcome this issue, allowing for applications to larger-scale practical problems. To solve mixed-integer bilinear programs online with data-driven methods, several formulations exist, including mathematical programming with complementarity constraints (MPCC) and mixed-integer programming (MIP). In this work, we benchmark the performance of these data-driven schemes on a bookshelf organization problem that has discrete mode switches and collision avoidance constraints. The success rate, optimal cost, and solving time are compared, along with those of non-data-driven methods. Our proposed methods are demonstrated as a high-level planner for a robotic arm on the bookshelf problem.
The task of argument mining aims to automatically detect all possible argumentative components and identify their relationships. As argument mining is a thriving field in natural language processing, a large number of corpora have been built for academic study and application development. However, research in this area is still constrained by the inherent limitations of existing datasets. Specifically, all the publicly available datasets are relatively small in scale, and few of them provide information from other modalities to facilitate the learning process. Moreover, the statements and expressions in these corpora are usually in a compact form, which means non-adjacent clauses or text segments will always be regarded as multiple individual components, thus restricting the generalization ability of models. To this end, we collect and contribute a novel dataset, AntCritic, to serve as a helpful complement to this area. It consists of about 10k free-form and visually-rich financial comments and supports both argument component detection and argument relation prediction tasks. In addition, to cope with the challenges brought by scenario expansion and problem setting modification, we thoroughly explore the fine-grained relation prediction and structure reconstruction scheme for free-form documents and discuss the encoding mechanism for visual styles and layouts. Based on these analyses, we design two simple but effective model architectures and conduct various experiments on this dataset to provide benchmark performances as a reference and verify the practicability of our proposed architectures.
In this paper, we present a motion planner for LIMMS, a modular, multi-agent, multi-modal package delivery platform. A single LIMMS unit is a robot that can operate as an arm or a leg depending on how and what it is attached to, e.g., as a manipulator when it is anchored to walls within a delivery vehicle, or as a leg of a quadruped robot when four units are attached to a box. Coordinating multiple LIMMS units, each of which can take on vastly different roles, can quickly become complex. For this planning problem, we first compose the necessary logic and constraints. The formulation is then solved for skill exploration and can be implemented on hardware after refinement. To solve this optimization problem, we use the alternating direction method of multipliers (ADMM). The proposed planner is tested in various scenarios, which show the capability of LIMMS to enter different modes, or combinations of them, to achieve the goal of moving shipping boxes.
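For readers unfamiliar with ADMM, the basic iteration can be sketched on a toy problem. The abstract does not give the planner's formulation, so the problem below (project a point `a` onto a box by splitting `x = z`) and the names `admm_box_projection`, `rho`, and `iters` are purely illustrative assumptions and are unrelated to the LIMMS constraints.

```python
import numpy as np


def admm_box_projection(a, lo, hi, rho=1.0, iters=100):
    """Toy ADMM: minimize 0.5 * ||x - a||^2  subject to  lo <= x <= hi.

    The problem is split as f(x) = 0.5 * ||x - a||^2 and g(z) = indicator
    of the box, coupled by the constraint x = z. u is the scaled dual.
    """
    a = np.asarray(a, float)
    z = np.zeros_like(a)
    u = np.zeros_like(a)
    for _ in range(iters):
        # x-update: closed-form minimizer of f(x) + (rho/2) * ||x - z + u||^2
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: Euclidean projection of x + u onto the box
        z = np.clip(x + u, lo, hi)
        # dual update: accumulate the constraint residual x - z
        u = u + x - z
    return z
```

For this toy problem the exact answer is `np.clip(a, lo, hi)`, which makes it easy to check that the iteration converges. The appeal of the same three-step pattern in larger planning problems is that each update only has to handle one part of the objective or constraints at a time.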
Convex model predictive control (MPC) with a single rigid body model has demonstrated strong performance on real legged robots. However, convex MPC is limited by its assumptions, such as small rotation angles and a pre-defined gait, which restrict the richness of potential solutions. We remove those assumptions and solve the complete mixed-integer non-convex program with a single rigid body model. We first collect datasets of pre-solved problems offline, then learn the problem-solution map to solve this optimization quickly for MPC. If warm-starts can be found, offline problems can be solved close to global optimality. The proposed controller is tested by generating various gaits and behaviors depending on the initial conditions. A hardware test demonstrates online gait generation and adaptation running at more than 50 Hz based on sensor feedback.
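One simple way to picture a learned problem-solution map is a nearest-neighbor lookup over the offline dataset, sketched below. The abstract does not say which learner is used, so k-nearest neighbors is a stand-in chosen for brevity, and the function name `nearest_warm_starts`, the Euclidean distance metric, and the example data are all hypothetical.

```python
import numpy as np


def nearest_warm_starts(params, solutions, query, k=3):
    """Return the stored solutions of the k problems closest to `query`.

    params:    (N, d) array of problem parameters solved offline
               (for example initial conditions)
    solutions: length-N list of the corresponding stored solutions
    query:     (d,) parameters of the new problem to warm-start
    """
    params = np.asarray(params, float)
    dists = np.linalg.norm(params - np.asarray(query, float), axis=1)
    order = np.argsort(dists)[:k]  # indices of the k closest problems
    return [solutions[i] for i in order]


# Hypothetical usage: each stored solution is a tuple of integer decisions.
offline_params = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
offline_solutions = [(0, 0, 1), (1, 0, 1), (1, 1, 0)]
candidates = nearest_warm_starts(offline_params, offline_solutions, [0.9, 1.2], k=2)
```

In a warm-started mixed-integer solve, each returned candidate would typically be used to fix or initialize the discrete variables so that only a continuous problem remains to be solved online; trying several candidates guards against the closest one being infeasible for the new problem.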