Abstract:Humanoid robots are designed to perform diverse loco-manipulation tasks. However, they face challenges due to their high-dimensional and unstable dynamics, as well as the complex contact-rich nature of the tasks. Model-based optimal control methods offer precise and systematic control but are limited by high computational complexity and accurate contact sensing. On the other hand, reinforcement learning (RL) provides robustness and handles high-dimensional spaces but suffers from inefficient learning, unnatural motion, and sim-to-real gaps. To address these challenges, we introduce Opt2Skill, an end-to-end pipeline that combines model-based trajectory optimization with RL to achieve robust whole-body loco-manipulation. We generate reference motions for the Digit humanoid robot using differential dynamic programming (DDP) and train RL policies to track these trajectories. Our results demonstrate that Opt2Skill outperforms pure RL methods in both training efficiency and task performance, with optimal trajectories that account for torque limits enhancing trajectory tracking. We successfully transfer our approach to real-world applications.
Abstract:The 2023 Big ANN Challenge, held at NeurIPS 2023, focused on advancing the state-of-the-art in indexing data structures and search algorithms for practical variants of Approximate Nearest Neighbor (ANN) search that reflect the growing complexity and diversity of workloads. Unlike prior challenges that emphasized scaling up classical ANN search ~\cite{DBLP:conf/nips/SimhadriWADBBCH21}, this competition addressed filtered search, out-of-distribution data, sparse and streaming variants of ANNS. Participants developed and submitted innovative solutions that were evaluated on new standard datasets with constrained computational resources. The results showcased significant improvements in search accuracy and efficiency over industry-standard baselines, with notable contributions from both academic and industrial teams. This paper summarizes the competition tracks, datasets, evaluation metrics, and the innovative approaches of the top-performing submissions, providing insights into the current advancements and future directions in the field of approximate nearest neighbor search.
Abstract:Humans perceive and interact with the world with the awareness of equivariance, facilitating us in manipulating different objects in diverse poses. For robotic manipulation, such equivariance also exists in many scenarios. For example, no matter what the pose of a drawer is (translation, rotation and tilt), the manipulation strategy is consistent (grasp the handle and pull in a line). While traditional models usually do not have the awareness of equivariance for robotic manipulation, which might result in more data for training and poor performance in novel object poses, we propose our EqvAfford framework, with novel designs to guarantee the equivariance in point-level affordance learning for downstream robotic manipulation, with great performance and generalization ability on representative tasks on objects in diverse poses.
Abstract:Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potential as policy models for offline RL issues. However, previous offline RL algorithms based on diffusion policies generally adopt weighted regression to improve the policy. This approach optimizes the policy only using the collected actions and is sensitive to Q-values, which limits the potential for further performance enhancement. To this end, we propose a novel preferred-action-optimized diffusion policy for offline RL. In particular, an expressive conditional diffusion model is utilized to represent the diverse distribution of a behavior policy. Meanwhile, based on the diffusion model, preferred actions within the same behavior distribution are automatically generated through the critic function. Moreover, an anti-noise preference optimization is designed to achieve policy improvement by using the preferred actions, which can adapt to noise-preferred actions for stable training. Extensive experiments demonstrate that the proposed method provides competitive or superior performance compared to previous state-of-the-art offline RL methods, particularly in sparse reward tasks such as Kitchen and AntMaze. Additionally, we empirically prove the effectiveness of anti-noise preference optimization.
Abstract:As Embodied AI advances, it increasingly enables robots to handle the complexity of household manipulation tasks more effectively. However, the application of robots in these settings remains limited due to the scarcity of bimanual-mobile robot manipulation datasets. Existing datasets either focus solely on simple grasping tasks using single-arm robots without mobility, or collect sensor data limited to a narrow scope of sensory inputs. As a result, these datasets often fail to encapsulate the intricate and dynamic nature of real-world tasks that bimanual-mobile robots are expected to perform. To address these limitations, we introduce BRMData, a Bimanual-mobile Robot Manipulation Dataset designed specifically for household applications. The dataset includes 10 diverse household tasks, ranging from simple single-arm manipulation to more complex dual-arm and mobile manipulations. It is collected using multi-view and depth-sensing data acquisition strategies. Human-robot interactions and multi-object manipulations are integrated into the task designs to closely simulate real-world household applications. Moreover, we present a Manipulation Efficiency Score (MES) metric to evaluate both the precision and efficiency of robot manipulation methods. BRMData aims to drive the development of versatile robot manipulation technologies, specifically focusing on advancing imitation learning methods from human demonstrations. The dataset is now open-sourced and available at https://embodiedrobot.github.io/, enhancing research and development efforts in the field of Embodied Manipulation.
Abstract:Equipping a conversational search engine with strategies regarding when to ask clarification questions is becoming increasingly important across various domains. Attributing to the context understanding capability of LLMs and their access to domain-specific sources of knowledge, LLM-based clarification strategies feature rapid transfer to various domains in a post-hoc manner. However, they still struggle to deliver promising performance on unseen domains, struggling to achieve effective domain transferability. We take the first step to investigate this issue and existing methods tend to produce one-size-fits-all strategies across diverse domains, limiting their search effectiveness. In response, we introduce a novel method, called Style, to achieve effective domain transferability. Our experimental results indicate that Style bears strong domain transferability, resulting in an average search performance improvement of ~10% on four unseen domains.
Abstract:Temporal Knowledge Graph (TKG) reasoning is based on historical information to predict the future. Therefore, parsing and mining historical information is key to predicting the future. Most existing methods fail to concurrently address and comprehend historical information from both global and local perspectives. Neglecting the global view might result in overlooking macroscopic trends and patterns, while ignoring the local view can lead to missing critical detailed information. Additionally, some methods do not focus on learning from high-frequency repeating events, which means they may not fully grasp frequently occurring historical events. To this end, we propose the \textbf{R}epetitive-\textbf{L}ocal-\textbf{G}lobal History \textbf{Net}work(RLGNet). We utilize a global history encoder to capture the overarching nature of historical information. Subsequently, the local history encoder provides information related to the query timestamp. Finally, we employ the repeating history encoder to identify and learn from frequently occurring historical events. In the evaluation on six benchmark datasets, our approach generally outperforms existing TKG reasoning models in multi-step and single-step reasoning tasks.
Abstract:Traffic prediction constitutes a pivotal facet within the purview of Intelligent Transportation Systems (ITS), and the attainment of highly precise predictions holds profound significance for efficacious traffic management. The precision of prevailing deep learning-driven traffic prediction models typically sees an upward trend with a rise in the volume of training data. However, the procurement of comprehensive spatiotemporal datasets for traffic is often fraught with challenges, primarily stemming from the substantial costs associated with data collection and retention. Consequently, developing a model that can achieve accurate predictions and good generalization ability in areas with limited historical traffic data is a challenging problem. It is noteworthy that the rapidly advancing pretrained Large Language Models (LLMs) of recent years have demonstrated exceptional proficiency in cross-modality knowledge transfer and few-shot learning. Recognizing the sequential nature of traffic data, similar to language, we introduce TPLLM, a novel traffic prediction framework leveraging LLMs. In this framework, we construct a sequence embedding layer based on Convolutional Neural Networks (CNNs) and a graph embedding layer based on Graph Convolutional Networks (GCNs) to extract sequence features and spatial features, respectively. These are subsequently integrated to form inputs that are suitable for LLMs. A Low-Rank Adaptation (LoRA) fine-tuning approach is applied to TPLLM, thereby facilitating efficient learning and minimizing computational demands. Experiments on two real-world datasets demonstrate that TPLLM exhibits commendable performance in both full-sample and few-shot prediction scenarios, effectively supporting the development of ITS in regions with scarce historical traffic data.
Abstract:Historically, the economic recession often came abruptly and disastrously. For instance, during the 2008 financial crisis, the SP 500 fell 46 percent from October 2007 to March 2009. If we could detect the signals of the crisis earlier, we could have taken preventive measures. Therefore, driven by such motivation, we use advanced machine learning techniques, including Random Forest and Extreme Gradient Boosting, to predict any potential market crashes mainly in the US market. Also, we would like to compare the performance of these methods and examine which model is better for forecasting US stock market crashes. We apply our models on the daily financial market data, which tend to be more responsive with higher reporting frequencies. We consider 75 explanatory variables, including general US stock market indexes, SP 500 sector indexes, as well as market indicators that can be used for the purpose of crisis prediction. Finally, we conclude, with selected classification metrics, that the Extreme Gradient Boosting method performs the best in predicting US stock market crisis events.
Abstract:Atrial fibrillation (AF) and ventricular tachycardia (VT) are two of the sustained arrhythmias that significantly affect the quality of life of patients. Treatment of AF and VT often requires radiofrequency ablation of heart tissues using an ablation catheter. Recent progress in ablation therapy leverages magnetic resonance imaging (MRI) for higher contrast visual feedback, and additionally utilizes a guiding sheath with an actively deflectable tip to improve the dexterity of the catheter inside the heart. This paper presents the design and validation of an MR-conditional robotic module for automated actuation of both the ablation catheter and the sheath. The robotic module features a compact design for improved accessibility inside the MR scanner bore and is driven by piezoelectric motors to ensure MR-conditionality. The combined catheter-sheath mechanism is essentially a concentric tendon-driven continuum robot and its kinematics is modeled by the constant curvature model for closed-loop position control. Path following experiments were conducted to validate the actuation module and control scheme, achieving < 2 mm average tip position error.