We present a novel approach that aims to address both safety and stability of a haptic teleoperation system within a framework of Haptic Shared Autonomy (HSA). We use Control Barrier Functions (CBFs) to generate the control input that follows the user's input as closely as possible while guaranteeing safety. In the context of stability of the human-in-the-loop system, we limit the force feedback perceived by the user via a small $L_2$-gain, which is achieved by limiting the control and the force feedback via a differential constraint. Specifically, with the property of HSA, we propose two pathways to design the control and the force feedback: Sequential Control Force (SCF) and Joint Control Force (JCF). Both designs can achieve safety and stability but with different responses to the user's commands. We conducted experimental simulations to evaluate and investigate the properties of the designed methods. We also tested the proposed method on a physical quadrotor UAV and a haptic interface.
Using machine learning (ML) techniques to predict material properties is a crucial research topic. These properties depend on numerical data and semantic factors. Due to the limitations of small-sample datasets, existing methods typically adopt ML algorithms to regress numerical properties or transfer other pre-trained knowledge graphs (KGs) to the material. However, these methods cannot simultaneously handle semantic and numerical information. In this paper, we propose a numerical reasoning method for material KGs (NR-KG), which constructs a cross-modal KG using semantic nodes and numerical proxy nodes. It captures both types of information by projecting KG into a canonical KG and utilizes a graph neural network to predict material properties. In this process, a novel projection prediction loss is proposed to extract semantic features from numerical information. NR-KG facilitates end-to-end processing of cross-modal data, mining relationships and cross-modal information in small-sample datasets, and fully utilizes valuable experimental data to enhance material prediction. We further propose two new High-Entropy Alloys (HEA) property datasets with semantic descriptions. NR-KG outperforms state-of-the-art (SOTA) methods, achieving relative improvements of 25.9% and 16.1% on two material datasets. Besides, NR-KG surpasses SOTA methods on two public physical chemistry molecular datasets, showing improvements of 22.2% and 54.3%, highlighting its potential application and generalizability. We hope the proposed datasets, algorithms, and pre-trained models can facilitate the communities of KG and AI for materials.
Temporal knowledge prediction is a crucial task for event early warning that has gained increasing attention in recent years. It aims to predict future facts from relevant historical facts on temporal knowledge graphs. There are two main difficulties in this prediction task. First, from the perspective of historical facts, how to model the evolutionary patterns of the facts to predict the query accurately. Second, from the query perspective, how to handle the two cases where the query contains seen and unseen entities in a unified framework. Motivated by these two problems, we propose a novel adaptive pseudo-siamese policy network for temporal knowledge prediction based on reinforcement learning. Specifically, we design the policy network in our model as a pseudo-siamese policy network that consists of two sub-policy networks. In sub-policy network I, the agent searches for the answer to the query along entity-relation paths to capture the static evolutionary patterns. In sub-policy network II, the agent searches for the answer to the query along relation-time paths to deal with unseen entities. Moreover, we develop a temporal relation encoder to capture the temporal evolutionary patterns. Finally, we design a gating mechanism to adaptively integrate the results of the two sub-policy networks to help the agent focus on the destination answer. To assess our model's performance, we conduct link prediction on four benchmark datasets. The experimental results demonstrate that our method achieves competitive performance compared with existing methods.
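As a rough illustration of the gating idea mentioned in the abstract above, the following sketch blends the candidate scores of two sub-policy networks with a single scalar gate. The sigmoid gate, the scalar `gate_logit`, and the function names are assumptions made for illustration; the abstract does not specify how the gate is parameterized.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_scores(scores_a, scores_b, gate_logit):
    """Blend two score lists over the same candidate entities.

    scores_a:   scores from sub-policy network I (hypothetical input)
    scores_b:   scores from sub-policy network II (hypothetical input)
    gate_logit: scalar logit; in a real model this would be learned
                from the query, here it is just passed in (assumption).
    """
    g = sigmoid(gate_logit)
    return [g * a + (1.0 - g) * b for a, b in zip(scores_a, scores_b)]


if __name__ == "__main__":
    # A gate logit of 0 weights both sub-policies equally.
    print(gated_scores([1.0, 0.0], [0.0, 1.0], 0.0))
```

A large positive `gate_logit` makes the output follow the first score list, and a large negative one makes it follow the second.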
Knowledge graph embedding~(KGE) aims to represent entities and relations as low-dimensional vectors for many real-world applications. The representations of entities and relations are learned by contrasting positive and negative triplets. Thus, high-quality negative samples are extremely important in KGE. However, present KGE models either rely on simple negative sampling methods, which makes it difficult to obtain informative negative triplets, or employ complex adversarial methods, which require more training data and strategies. In addition, these methods can only construct negative triplets from existing entities, which limits the potential to explore harder negative triplets. To address these issues, we adopt a mixing operation to generate harder negative samples for knowledge graphs and introduce an inexpensive but effective method called MixKG. Technically, MixKG first proposes two kinds of criteria to filter hard negative triplets among the sampled negatives: one based on the scoring function and one based on similarity to the correct entity. Then, MixKG synthesizes harder negative samples via convex combinations of paired selected hard negatives. Experiments on two public datasets and four classical KGE methods show that MixKG is superior to previous negative sampling algorithms.
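The two steps described in the abstract above (filter hard negatives, then mix pairs by convex combination) can be sketched as follows. This is a minimal illustration, not the MixKG implementation: it assumes that a higher score means a harder negative, that the mixing coefficient is drawn uniformly from [0, 1), and that partners are chosen at random among the selected negatives. All function names are hypothetical.

```python
import random


def select_hard_negatives(neg_embs, scores, k):
    """Keep the k negatives with the highest scores.

    Assumption: a higher score means the model finds the negative
    triplet more plausible, so it is treated as harder.
    """
    order = sorted(range(len(neg_embs)), key=lambda i: scores[i], reverse=True)
    return [neg_embs[i] for i in order[:k]]


def mix_pair(e_i, e_j, alpha):
    """Convex combination alpha * e_i + (1 - alpha) * e_j of two embeddings."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(e_i, e_j)]


def mix_hard_negatives(hard, rng):
    """Pair each hard negative with a random partner and mix them.

    Assumption: alpha ~ Uniform[0, 1); the source does not state
    which distribution is used.
    """
    mixed = []
    for e_i in hard:
        e_j = rng.choice(hard)
        mixed.append(mix_pair(e_i, e_j, rng.random()))
    return mixed


if __name__ == "__main__":
    negatives = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
    scores = [0.1, 0.9, 0.5]
    hard = select_hard_negatives(negatives, scores, k=2)
    print(mix_hard_negatives(hard, random.Random(0)))
```

Because each synthetic negative is a convex combination, every coordinate stays within the range spanned by the selected hard negatives, so the mixed samples need not correspond to any existing entity.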
Knowledge Graphs (KGs) have shown great success in recommendation. This success is attributed to the rich attribute information contained in KGs, which serves as side information to improve item and user representations. However, existing knowledge-aware methods leverage attribute information only at a coarse-grained level on both the item and user sides. In this paper, we propose a novel attentive knowledge graph attribute network (AKGAN) to learn item attributes and user interests via attribute information in the KG. Technically, AKGAN adopts a heterogeneous graph neural network framework that uses different designs for the first layer and the later layers. With each attribute placed in its corresponding range of element-wise positions, AKGAN employs a novel interest-aware attention network, which relaxes the constraint that the attention weights sum to 1, to model the complexity and personalization of user interests towards attributes. Experimental results on three benchmark datasets show the effectiveness and explainability of AKGAN.
Recently, the x-vector has been a successful and popular approach for speaker verification; it employs a time delay neural network (TDNN) and statistics pooling to extract speaker-characterizing embeddings from variable-length utterances. Improving upon the x-vector has been an active research area, and numerous neural networks have been elaborately designed based on the x-vector, e.g., extended TDNN (E-TDNN), factorized TDNN (F-TDNN), and densely connected TDNN (D-TDNN). In this work, we try to identify the optimal architectures from a TDNN-based search space by employing neural architecture search (NAS); we name the approach SpeechNAS. Leveraging recent advances in speaker recognition, such as high-order statistics pooling, the multi-branch mechanism, D-TDNN, and angular additive margin softmax (AAM) loss with a minimum hyper-spherical energy (MHE), SpeechNAS automatically discovers five network architectures, from SpeechNAS-1 to SpeechNAS-5, with various numbers of parameters and GFLOPs on the large-scale text-independent speaker recognition dataset VoxCeleb1. Our best derived neural network achieves an equal error rate (EER) of 1.02% on the standard test set of VoxCeleb1, which surpasses previous TDNN-based state-of-the-art approaches by a large margin. Code and trained weights are available at https://github.com/wentaozhu/speechnas.git
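To make the role of statistics pooling in the abstract above concrete, here is a minimal sketch of the basic first- and second-order version: frame-level features of any length are reduced to a fixed-size vector by concatenating the per-dimension mean and standard deviation over time. The use of the population standard deviation and the function name are assumptions; the high-order pooling mentioned in the abstract is not reproduced here.

```python
import math


def statistics_pooling(frames):
    """Pool a variable-length sequence of frame vectors into one vector.

    frames: list of T frame-level feature vectors, each of dimension D.
    Returns a list of length 2 * D: per-dimension means followed by
    per-dimension (population) standard deviations over the T frames.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return means + stds


if __name__ == "__main__":
    short_utt = [[1.0, 2.0], [3.0, 2.0]]
    long_utt = [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0], [2.0, 2.0]]
    # Both outputs have length 4 regardless of the number of frames.
    print(statistics_pooling(short_utt))
    print(statistics_pooling(long_utt))
```

The output length depends only on the feature dimension, which is what lets a fixed-size embedding layer follow a network that processes utterances of different durations.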
Time Delay Neural Network (TDNN)-based methods are widely used in dialect identification. However, previous TDNN-based work has neglected subtle variations across different feature scales. To address this issue, we propose a new architecture, named dynamic multi-scale convolution, which consists of dynamic kernel convolution, local multi-scale learning, and global multi-scale pooling. Dynamic kernel convolution adaptively captures features between short-term and long-term context. Local multi-scale learning, which represents multi-scale features at a granular level, increases the range of receptive fields for the convolution operation. In addition, global multi-scale pooling aggregates features from different bottleneck layers in order to collect information from multiple aspects. The proposed architecture significantly outperforms state-of-the-art systems on the AP20-OLR-dialect-task of the oriental language recognition (OLR) challenge 2020, with the best average cost performance (Cavg) of 0.067 and the best equal error rate (EER) of 6.52%. Compared with the best known results, our method achieves relative improvements of 9% in Cavg and 45% in EER. Furthermore, the proposed model has 91% fewer parameters than the best known model.
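Since the abstract above reports results as an equal error rate (EER), a small sketch of how such a figure can be computed from trial scores may help. This is a simple threshold sweep without interpolation, written for illustration; it is not the official scoring tool of the OLR challenge, and the function name and the accept rule (score greater than or equal to the threshold) are assumptions.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over observed scores.

    A trial is accepted when its score is >= the threshold.
    FRR: fraction of target trials rejected.
    FAR: fraction of non-target trials accepted.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, eer = None, None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


if __name__ == "__main__":
    # Perfectly separated scores give an EER of zero.
    print(equal_error_rate([0.9, 0.8], [0.1, 0.2]))
    # Overlapping scores give a non-zero EER.
    print(equal_error_rate([0.9, 0.4], [0.6, 0.1]))
```

With many trials, production scoring tools typically interpolate between adjacent operating points rather than picking the closest observed threshold.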
Graph representation learning, which aims to learn a discriminative embedding for each node in a graph, has attracted a surge of interest recently. Most of these representation methods focus on supervised learning and depend heavily on label information. However, graph annotations are expensive to obtain in the real world, especially in specialized domains (e.g., biology), because the annotator needs domain knowledge to label the graph. To address this problem, self-supervised learning provides a feasible solution for graph representation learning. In this paper, we propose a Multi-Level Graph Contrastive Learning (MLGCL) framework for learning robust representations of graph data by contrasting space views of graphs. Specifically, we introduce a novel pair of contrastive views: the topological view and the feature space view. The original graph is a first-order approximation of the structure and contains uncertainty or error, while the $k$NN graph generated by encoding features preserves high-order proximity. Thus, the $k$NN graph generated by encoding features not only provides a complementary view but is also more suitable for a GNN encoder to extract discriminative representations. Furthermore, we develop a multi-level contrastive mode to preserve the local similarity and semantic similarity of graph-structured data simultaneously. Extensive experiments on seven datasets indicate that MLGCL achieves promising results compared with existing state-of-the-art graph representation learning methods.
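The feature space view described in the abstract above relies on a $k$NN graph built from node features. The sketch below shows one simple way to build such a graph. It assumes Euclidean distance and operates directly on the given feature vectors, whereas the abstract builds the graph from encoded features; the distance choice and function name are illustrative assumptions.

```python
import math


def knn_graph(features, k):
    """Build a directed kNN edge set from node feature vectors.

    features: list of N feature vectors (one per node).
    k:        number of nearest neighbours per node.
    Returns a set of (i, j) pairs meaning j is among the k nearest
    neighbours of i under Euclidean distance (assumption).
    """
    n = len(features)
    edges = set()
    for i in range(n):
        dists = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            edges.add((i, j))
    return edges


if __name__ == "__main__":
    feats = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
    print(sorted(knn_graph(feats, k=1)))
```

Nodes that are close in feature space become neighbours even if the original topology does not link them, which is why this view can complement the given graph structure.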
We present a novel haptic teleoperation approach that considers not only the safety but also the stability of a teleoperation system. Specifically, we build upon previous work on haptic shared control, which uses control barrier functions (CBFs) to generate a reference haptic feedback that informs the human operator on the internal state of the system, helping them to safely navigate the robot without taking away their control authority. Crucially, in this approach the force rendered to the user is not directly reflected in the motion of the robot (which is still directly controlled by the user); however, previous work in the area neglected to consider the feedback loop through the user, possibly resulting in unstable closed trajectories. In this paper we introduce a differential constraint on the rendered force that makes the system finite-gain $L_2$ stable; the constraint results in a Quadratically Constrained Quadratic Program (QCQP), for which we provide a closed-form solution. Our constraint is related to but less restrictive than the typical passivity constraint used in previous literature. We conducted an experimental simulation in which a human operator flies a UAV near an obstacle to evaluate the proposed method.
Shared autonomy teleoperation can guarantee safety, but does so by reducing the human operator's control authority, which can lead to reduced levels of human-robot agreement and user satisfaction. This paper presents a novel haptic shared autonomy teleoperation paradigm that uses haptic feedback to inform the user about the inner state of a shared autonomy paradigm, while still guaranteeing safety. This differs from haptic shared control, which uses haptic feedback to inform the user's actions, but gives the human operator full control over the robot's actions. We conducted a user study in which twelve users flew a simulated UAV in a search-and-rescue task with no assistance or assistance provided by haptic shared control, shared autonomy, or haptic shared autonomy. All assistive teleoperation methods use control barrier functions to find a control command that is both safe and as close as possible to the human-generated control command. For assistive teleoperation conditions with haptic feedback, we apply a force to the user that is proportional to the difference between the human-generated control and the safe control. We find that haptic shared autonomy improves the user's task performance and satisfaction. We also find that haptic feedback in assistive teleoperation can improve the user's situational awareness. Finally, results show that adding haptic feedback to shared-autonomy teleoperation can improve human-robot agreement.
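The abstract above describes two computations: a control barrier function (CBF) filter that finds the safe command closest to the human command, and a force proportional to the difference between the two. The sketch below illustrates both under strong simplifying assumptions that are not taken from the paper: single-integrator dynamics ($\dot{x} = u$), one circular obstacle centred at the origin, barrier $h(x) = \|x\|^2 - r^2$, a linear class-K term $\alpha h$, and a hypothetical force gain `k_f`. With a single affine constraint, the quadratic program has the closed-form projection used here.

```python
def cbf_safe_control(x, u_h, radius, alpha):
    """Closest command to u_h satisfying grad_h(x) . u + alpha * h(x) >= 0.

    Assumes x_dot = u and h(x) = ||x||^2 - radius^2 (illustrative only).
    """
    h = sum(xi * xi for xi in x) - radius ** 2
    a = [2.0 * xi for xi in x]  # gradient of h
    slack = sum(ai * ui for ai, ui in zip(a, u_h)) + alpha * h
    if slack >= 0.0:
        return list(u_h)  # human command is already safe
    aa = sum(ai * ai for ai in a)
    if aa == 0.0:
        raise ValueError("gradient vanishes at the obstacle centre")
    # Project u_h onto the boundary of the safe half-space.
    return [ui - (slack / aa) * ai for ui, ai in zip(u_h, a)]


def haptic_force(u_h, u_safe, k_f):
    """Force proportional to (safe command - human command).

    The sign convention and the gain k_f are assumptions.
    """
    return [k_f * (us - uh) for us, uh in zip(u_safe, u_h)]


if __name__ == "__main__":
    x = [2.0, 0.0]      # robot position
    u_h = [-4.0, 0.0]   # human pushes toward the obstacle
    u_safe = cbf_safe_control(x, u_h, radius=1.0, alpha=1.0)
    print(u_safe, haptic_force(u_h, u_safe, k_f=2.0))
```

When the human command already satisfies the constraint, the filter returns it unchanged and the force is zero, so the operator only feels feedback when the filter intervenes.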