To assist robots in teleoperation tasks, haptic rendering which allows human operators access a virtual touch feeling has been developed in recent years. Most previous haptic rendering methods strongly rely on data collected by tactile sensors. However, tactile data is not widely available for robots due to their limited reachable space and the restrictions of tactile sensors. To eliminate the need for tactile data, in this paper we propose a novel method named as Vis2Hap to generate haptic rendering from visual inputs that can be obtained from a distance without physical interaction. We take the surface texture of objects as key cues to be conveyed to the human operator. To this end, a generative model is designed to simulate the roughness and slipperiness of the object's surface. To embed haptic cues in Vis2Hap, we use height maps from tactile sensors and spectrograms from friction coefficients as the intermediate outputs of the generative model. Once Vis2Hap is trained, it can be used to generate height maps and spectrograms of new surface textures, from which a friction image can be obtained and displayed on a haptic display. The user study demonstrates that our proposed Vis2Hap method enables users to access a realistic haptic feeling similar to that of physical objects. The proposed vision-based haptic rendering has the potential to enhance human operators' perception of the remote environment and facilitate robotic manipulation.
The CTC model has been widely applied to many application scenarios because of its simple structure, excellent performance, and fast inference speed. There are many peaks in the probability distribution predicted by the CTC models, and each peak represents a non-blank token. The recognition latency of CTC models can be reduced by encouraging the model to predict peaks earlier. Existing methods to reduce latency require modifying the transition relationship between tokens in the forward-backward algorithm, and the gradient calculation. Some of these methods even depend on the forced alignment results provided by other pretrained models. The above methods are complex to implement. To reduce the peak latency, we propose a simple and novel method named peak-first regularization, which utilizes a frame-wise knowledge distillation function to force the probability distribution of the CTC model to shift left along the time axis instead of directly modifying the calculation process of CTC loss and gradients. All the experiments are conducted on a Chinese Mandarin dataset AISHELL-1. We have verified the effectiveness of the proposed regularization on both streaming and non-streaming CTC models respectively. The results show that the proposed method can reduce the average peak latency by about 100 to 200 milliseconds with almost no degradation of recognition accuracy.
A fundamental challenge in deep metric learning is the generalization capability of the feature embedding network model since the embedding network learned on training classes need to be evaluated on new test classes. To address this challenge, in this paper, we introduce a new method called coded residual transform (CRT) for deep metric learning to significantly improve its generalization capability. Specifically, we learn a set of diversified prototype features, project the feature map onto each prototype, and then encode its features using their projection residuals weighted by their correlation coefficients with each prototype. The proposed CRT method has the following two unique characteristics. First, it represents and encodes the feature map from a set of complimentary perspectives based on projections onto diversified prototypes. Second, unlike existing transformer-based feature representation approaches which encode the original values of features based on global correlation analysis, the proposed coded residual transform encodes the relative differences between the original features and their projected prototypes. Embedding space density and spectral decay analysis show that this multi-perspective projection onto diversified prototypes and coded residual representation are able to achieve significantly improved generalization capability in metric learning. Finally, to further enhance the generalization performance, we propose to enforce the consistency on their feature similarity matrices between coded residual transforms with different sizes of projection prototypes and embedding dimensions. Our extensive experimental results and ablation studies demonstrate that the proposed CRT method outperform the state-of-the-art deep metric learning methods by large margins and improving upon the current best method by up to 4.28% on the CUB dataset.
We present an algorithmic contribution to improve the efficiency of robust trim-fitting in outlier affected geometric regression problems. The method heavily relies on the quick sort algorithm, and we present two important insights. First, partial sorting is sufficient for the incremental calculation of the x-th percentile value. Second, the normal equations in linear fitting problems may be updated incrementally by logging swap operations across the x-th percentile boundary during sorting. Besides linear fitting problems, we demonstrate how the technique can be additionally applied to closed-form, non-linear energy minimization problems, thus enabling efficient trim fitting under geometrically optimal objectives. We apply our method to two distinct camera resectioning algorithms, and demonstrate highly efficient and reliable, geometric trim fitting.
In this paper, we propose SATformer, a novel Transformer-based solution for Boolean satisfiability (SAT) solving. Different from existing learning-based SAT solvers that learn at the problem instance level, SATformer learns the minimum unsatisfiable cores (MUC) of unsatisfiable problem instances, which provide rich information for the causality of such problems. Specifically, we apply a graph neural network (GNN) to obtain the embeddings of the clauses in the conjunctive normal format (CNF). A hierarchical Transformer architecture is applied on the clause embeddings to capture the relationships among clauses, and the self-attention weight is learned to be high when those clauses forming UNSAT cores are attended together, and set to be low otherwise. By doing so, SATformer effectively learns the correlations among clauses for SAT prediction. Experimental results show that SATformer is more powerful than existing end-to-end learning-based SAT solvers.
Human leukocyte antigen (HLA) is an important molecule family in the field of human immunity, which recognizes foreign threats and triggers immune responses by presenting peptides to T cells. In recent years, the synthesis of tumor vaccines to induce specific immune responses has become the forefront of cancer treatment. Computationally modeling the binding patterns between peptide and HLA can greatly accelerate the development of tumor vaccines. However, most of the prediction methods performance is very limited and they cannot fully take advantage of the analysis of existing biological knowledge as the basis of modeling. In this paper, we propose TripHLApan, a novel pan-specific prediction model, for HLA molecular peptide binding prediction. TripHLApan exhibits powerful prediction ability by integrating triple coding matrix, BiGRU + Attention models, and transfer learning strategy. The comprehensive evaluations demonstrate the effectiveness of TripHLApan in predicting HLA-I and HLA-II peptide binding in different test environments. The predictive power of HLA-I is further demonstrated in the latest data set. In addition, we show that TripHLApan has strong binding reconstitution ability in the samples of a melanoma patient. In conclusion, TripHLApan is a powerful tool for predicting the binding of HLA-I and HLA-II molecular peptides for the synthesis of tumor vaccines.
The rational design of novel molecules with desired bioactivity is a critical but challenging task in drug discovery, especially when treating a novel target family or understudied targets. Here, we propose PGMG, a pharmacophore-guided deep learning approach for bioactivate molecule generation. Through the guidance of pharmacophore, PGMG provides a flexible strategy to generate bioactive molecules with structural diversity in various scenarios using a trained variational autoencoder. We show that PGMG can generate molecules matching given pharmacophore models while maintaining a high level of validity, uniqueness, and novelty. In the case studies, we demonstrate the application of PGMG to generate bioactive molecules in ligand-based and structure-based drug de novo design, as well as in lead optimization scenarios. Overall, the flexibility and effectiveness of PGMG make it a useful tool for accelerating the drug discovery process.
Test point insertion (TPI) is a widely used technique for testability enhancement, especially for logic built-in self-test (LBIST) due to its relatively low fault coverage. In this paper, we propose a novel TPI approach based on deep reinforcement learning (DRL), named DeepTPI. Unlike previous learning-based solutions that formulate the TPI task as a supervised-learning problem, we train a novel DRL agent, instantiated as the combination of a graph neural network (GNN) and a Deep Q-Learning network (DQN), to maximize the test coverage improvement. Specifically, we model circuits as directed graphs and design a graph-based value network to estimate the action values for inserting different test points. The policy of the DRL agent is defined as selecting the action with the maximum value. Moreover, we apply the general node embeddings from a pre-trained model to enhance node features, and propose a dedicated testability-aware attention mechanism for the value network. Experimental results on circuits with various scales show that DeepTPI significantly improves test coverage compared to the commercial DFT tool. The code of this work is available at https://github.com/cure-lab/DeepTPI.
We present DeepSAT, a novel end-to-end learning framework for the Boolean satisfiability (SAT) problem. Unlike existing solutions trained on random SAT instances with relatively weak supervisions, we propose applying the knowledge of the well-developed electronic design automation (EDA) field for SAT solving. Specifically, we first resort to advanced logic synthesis algorithms to pre-process SAT instances into optimized and-inverter graphs (AIGs). By doing so, our training and test sets have a unified distribution, thus the learned model can generalize well to test sets of various sources of SAT instances. Next, we regard the distribution of SAT solutions being a product of conditional Bernoulli distributions. Based on this observation, we approximate the SAT solving procedure with a conditional generative model, leveraging a directed acyclic graph neural network with two polarity prototypes for conditional SAT modeling. To effectively train the generative model, with the help of logic simulation tools, we obtain the probabilities of nodes in the AIG being logic '1' as rich supervision. We conduct extensive experiments on various SAT instances. DeepSAT achieves significant accuracy improvements over state-of-the-art learning-based SAT solutions, especially when generalized to SAT instances that are large or with diverse distributions.
Kernel smooth is the most fundamental technique for data density and regression estimation. However, time-consuming is the biggest obstacle for the application that the direct evaluation of kernel smooth for $N$ samples needs ${O}\left( {{N}^{2}} \right)$ operations. People have developed fast smooth algorithms using the idea of binning with FFT. Unfortunately, the accuracy is not controllable, and the implementation for multivariable and its bandwidth selection for the fast method is not available. Hence, we introduce a new MATLAB toolbox for fast multivariate kernel regression with the idea of non-uniform FFT (NUFFT), which implemented the algorithm for $M$ gridding points with ${O}\left( N+M\log M \right)$ complexity and accuracy controllability. The bandwidth selection problem utilizes the Fast Monte-Carlo algorithm to estimate the degree of freedom (DF), saving enormous cross-validation time even better when data share the same grid space for multiple regression. Up to now, this is the first toolbox for fast-binning high-dimensional kernel regression. Moreover, the estimation for local polynomial regression, the conditional variance for the heteroscedastic model, and the complex-valued datasets are also implemented in this toolbox. The performance is demonstrated with simulations and an application on the quantitive EEG.