In the era of big data, data-driven based classification has become an essential method in smart manufacturing to guide production and optimize inspection. The industrial data obtained in practice is usually time-series data collected by soft sensors, which are highly nonlinear, nonstationary, imbalanced, and noisy. Most existing soft-sensing machine learning models focus on capturing either intra-series temporal dependencies or pre-defined inter-series correlations, while ignoring the correlation between labels as each instance is associated with multiple labels simultaneously. In this paper, we propose a novel graph based soft-sensing neural network (GraSSNet) for multivariate time-series classification of noisy and highly-imbalanced soft-sensing data. The proposed GraSSNet is able to 1) capture the inter-series and intra-series dependencies jointly in the spectral domain; 2) exploit the label correlations by superimposing label graph that built from statistical co-occurrence information; 3) learn features with attention mechanism from both textual and numerical domain; and 4) leverage unlabeled data and mitigate data imbalance by semi-supervised learning. Comparative studies with other commonly used classifiers are carried out on Seagate soft sensing data, and the experimental results validate the competitive performance of our proposed method.
In this paper, we present a deep generative model based method to generate diverse human motion interpolation results. We resort to the Conditional Variational Auto-Encoder (CVAE) to learn human motion conditioned on a pair of given start and end motions, by leveraging the Recurrent Neural Network (RNN) structure for both the encoder and the decoder. Additionally, we introduce a regularization loss to further promote sample diversity. Once trained, our method is able to generate multiple plausible coherent motions by repetitively sampling from the learned latent space. Experiments on the publicly available dataset demonstrate the effectiveness of our method, in terms of sample plausibility and diversity.
Optimal transport (OT) naturally arises in a wide range of machine learning applications but may often become the computational bottleneck. Recently, one line of works propose to solve OT approximately by searching the \emph{transport plan} in a low-rank subspace. However, the optimal transport plan is often not low-rank, which tends to yield large approximation errors. For example, when Monge's \emph{transport map} exists, the transport plan is full rank. This paper concerns the computation of the OT distance with adequate accuracy and efficiency. A novel approximation for OT is proposed, in which the transport plan can be decomposed into the sum of a low-rank matrix and a sparse one. We theoretically analyze the approximation error. An augmented Lagrangian method is then designed to efficiently calculate the transport plan.
With the rapid development of AI technology in recent years, there have been many studies with deep learning models in soft sensing area. However, the models have become more complex, yet, the data sets remain limited: researchers are fitting million-parameter models with hundreds of data samples, which is insufficient to exercise the effectiveness of their models and thus often fail to perform when implemented in industrial applications. To solve this long-lasting problem, we are providing large scale, high dimensional time series manufacturing sensor data from Seagate Technology to the public. We demonstrate the challenges and effectiveness of modeling industrial big data by a Soft Sensing Transformer model on these data sets. Transformer is used because, it has outperformed state-of-the-art techniques in Natural Language Processing, and since then has also performed well in the direct application to computer vision without introduction of image-specific inductive biases. We observe the similarity of a sentence structure to the sensor readings and process the multi-variable sensor readings in a time series in a similar manner of sentences in natural language. The high-dimensional time-series data is formatted into the same shape of embedded sentences and fed into the transformer model. The results show that transformer model outperforms the benchmark models in soft sensing field based on auto-encoder and long short-term memory (LSTM) models. To the best of our knowledge, we are the first team in academia or industry to benchmark the performance of original transformer model with large-scale numerical soft sensing data.
We present a High-Resolution Transformer (HRFormer) that learns high-resolution representations for dense prediction tasks, in contrast to the original Vision Transformer that produces low-resolution representations and has high memory and computational cost. We take advantage of the multi-resolution parallel design introduced in high-resolution convolutional networks (HRNet), along with local-window self-attention that performs self-attention over small non-overlapping image windows, for improving the memory and computation efficiency. In addition, we introduce a convolution into the FFN to exchange information across the disconnected image windows. We demonstrate the effectiveness of the High-Resolution Transformer on both human pose estimation and semantic segmentation tasks, e.g., HRFormer outperforms Swin transformer by $1.3$ AP on COCO pose estimation with $50\%$ fewer parameters and $30\%$ fewer FLOPs. Code is available at: https://github.com/HRNet/HRFormer.
Recovering programs' call graphs is crucial for inter-procedural analysis tasks and applications based on them. The core challenge is recognizing targets of indirect calls (i.e., indirect callees). It becomes more challenging if target programs are in binary forms, due to information loss in binaries. Existing indirect callee recognition solutions for binaries all have high false positives and negatives, making call graphs inaccurate. In this paper, we propose a new solution iCallee based on the Siamese Neural Network, inspired by the advances in question-answering applications. The key insight is that, neural networks can learn to answer whether a callee function is a potential target of an indirect callsite by comprehending their contexts, i.e., instructions nearby callsites and of callees. Following this insight, we first preprocess target binaries to extract contexts of callsites and callees. Then, we build a customized Natural Language Processing (NLP) model applicable to assembly language. Further, we collect abundant pairs of callsites and callees, and embed their contexts with the NLP model, then train a Siamese network and a classifier to answer the callsite-callee question. We have implemented a prototype of iCallee and evaluated it on several groups of targets. Evaluation results showed that, our solution could match callsites to callees with an F1-Measure of 93.7%, recall of 93.8%, and precision of 93.5%, much better than state-of-the-art solutions. To show its usefulness, we apply iCallee to two specific applications - binary code similarity detection and binary program hardening, and found that it could greatly improve state-of-the-art solutions.
Importance: Lower-resource areas in Africa and Asia face a unique set of healthcare challenges: the dual high burden of communicable and non-communicable diseases; a paucity of highly trained primary healthcare providers in both rural and densely populated urban areas; and a lack of reliable, inexpensive internet connections. Objective: To address these challenges, we designed an artificial intelligence assistant to help primary healthcare providers in lower-resource areas document demographic and medical sign/symptom data and to record and share diagnostic data in real-time with a centralized database. Design: We trained our system using multiple data sets, including US-based electronic medical records (EMRs) and open-source medical literature and developed an adaptive, general medical assistant system based on machine learning algorithms. Main outcomes and Measure: The application collects basic information from patients and provides primary care providers with diagnoses and prescriptions suggestions. The application is unique from existing systems in that it covers a wide range of common diseases, signs, and medication typical in lower-resource countries; the application works with or without an active internet connection. Results: We have built and implemented an adaptive learning system that assists trained primary care professionals by means of an Android smartphone application, which interacts with a central database and collects real-time data. The application has been tested by dozens of primary care providers. Conclusions and Relevance: Our application would provide primary healthcare providers in lower-resource areas with a tool that enables faster and more accurate documentation of medical encounters. This application could be leveraged to automatically populate local or national EMR systems.
Many works have investigated the adversarial attacks or defenses under the settings where a bounded and imperceptible perturbation can be added to the input. However in the real-world, the attacker does not need to comply with this restriction. In fact, more threats to the deep model come from unrestricted adversarial examples, that is, the attacker makes large and visible modifications on the image, which causes the model classifying mistakenly, but does not affect the normal observation in human perspective. Unrestricted adversarial attack is a popular and practical direction but has not been studied thoroughly. We organize this competition with the purpose of exploring more effective unrestricted adversarial attack algorithm, so as to accelerate the academical research on the model robustness under stronger unbounded attacks. The competition is held on the TianChi platform (\url{https://tianchi.aliyun.com/competition/entrance/531853/introduction}) as one of the series of AI Security Challengers Program.