Feature extraction plays an important role in Electrocardiogram (ECG) Beats classification system. Compared to other popular methods, VQ method performs well in feature extraction from ECG with advantages of dimensionality reduction. In VQ method, a set of dictionaries corresponding to segments of ECG beats is trained, and VQ codes are used to represent each heartbeat. However, in practice, VQ codes optimized by k-means or k-means++ exist large quantization errors, which results in VQ codes for two heartbeats of the same type being very different. So the essential differences between different types of heartbeats cannot be representative well. On the other hand, VQ uses too much data during codebook construction, which limits the speed of dictionary learning. In this paper, we propose a new method to improve the speed and accuracy of VQ method. To reduce the computation of codebook construction, a set of sparse dictionaries corresponding to wave segments of ECG beats is constructed. After initialized, sparse dictionaries are updated efficiently by Feature-sign and Lagrange dual algorithm. Based on those dictionaries, a set of codes can be computed to represent original ECG beats.Experimental results show that features extracted from ECG by our method are more efficient and separable. The accuracy of our method is higher than other methods with less time consumption of feature extraction
The adoption of deep learning in healthcare is hindered by their "black box" nature. In this paper, we explore the RETAIN architecture for the task of glusose forecasting for diabetic people. By using a two-level attention mechanism, the recurrent-neural-network-based RETAIN model is interpretable. We evaluate the RETAIN model on the type-2 IDIAB and the type-1 OhioT1DM datasets by comparing its statistical and clinical performances against two deep models and three models based on decision trees. We show that the RETAIN model offers a very good compromise between accuracy and interpretability, being almost as accurate as the LSTM and FCN models while remaining interpretable. We show the usefulness of its interpretable nature by analyzing the contribution of each variable to the final prediction. It revealed that signal values older than one hour are not used by the RETAIN model for the 30-minutes ahead of time prediction of glucose. Also, we show how the RETAIN model changes its behavior upon the arrival of an event such as carbohydrate intakes or insulin infusions. In particular, it showed that the patient's state before the event is particularily important for the prediction. Overall the RETAIN model, thanks to its interpretability, seems to be a very promissing model for regression or classification tasks in healthcare.
Recently, the state-of-the-art deep learning methods have demonstrated impressive performance in segmentation tasks. However, the success of these methods depends on a large amount of manually labeled masks, which are expensive and time-consuming to be collected. To reduce the dependence on full labeled samples, we propose a novel Consistent Perception Generative Adversarial Network (CPGAN) for semi-supervised stroke lesion segmentation. Specifically, we design a non-local operation named similarity connection module (SCM) to capture the information of multi-scale features. This module can selectively aggregate the features at each position by a weighted sum. Furthermore, an assistant network is constructed to encourage the discriminator to learn meaningful feature representations which are forgotten during training. The assistant network and the discriminator are used to jointly decide whether the segmentation results are real or fake. With the semi-supervised stroke lesion segmentation, we adopt a consistent perception strategy to enhance the effect of brain stroke lesion prediction for the unlabeled data. The CPGAN was evaluated on the Anatomical Tracings of Lesions After Stroke (ATLAS). The experimental results demonstrate that the proposed network achieves superior segmentation performance. In semi-supervised segmentation task, our method using only two-fifths of labeled samples outperforms some approaches using full labeled samples.
Interactions between people are often governed by their relationships. On the flip side, social relationships are built upon several interactions. Two strangers are more likely to greet and introduce themselves while becoming friends over time. We are fascinated by this interplay between interactions and relationships, and believe that it is an important aspect of understanding social situations. In this work, we propose neural models to learn and jointly predict interactions, relationships, and the pair of characters that are involved. We note that interactions are informed by a mixture of visual and dialog cues, and present a multimodal architecture to extract meaningful information from them. Localizing the pair of interacting characters in video is a time-consuming process, instead, we train our model to learn from clip-level weak labels. We evaluate our models on the MovieGraphs dataset and show the impact of modalities, use of longer temporal context for predicting relationships, and achieve encouraging performance using weak labels as compared with ground-truth labels. Code is online.
This paper studies how to design abstractions of large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers in general purpose ways, and that are amenable to data-driven design. The goal is to arrive at new approaches that can reliably outperform existing solvers in wall-clock time. We focus on solving integer programs, and ground our approach in the large neighborhood search (LNS) paradigm, which iteratively chooses a subset of variables to optimize while leaving the remainder fixed. The appeal of LNS is that it can easily use any existing solver as a subroutine, and thus can inherit the benefits of carefully engineered heuristic approaches and their software implementations. We also show that one can learn a good neighborhood selector from training data. Through an extensive empirical validation, we demonstrate that our LNS framework can significantly outperform, in wall-clock time, compared to state-of-the-art commercial solvers such as Gurobi.
In this paper, we propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties for safer navigation through cluttered environments. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates through each step of our MPC's finite time horizon. The RNN model is trained on a dataset that comprises of robot and landmark poses generated from camera images and inertial measurement unit (IMU) readings via a state-of-the-art visual-inertial odometry framework. To detect and extract object locations for avoidance, we use a custom-trained convolutional neural network model in conjunction with a feature extractor to retrieve 3D centroid and radii boundaries of nearby obstacles. The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms, demonstrating autonomous behaviors that can plan fast and collision-free paths towards a goal point.
Cold bent glass is a promising and cost-efficient method for realizing doubly curved glass fa\c{c}ades. They are produced by attaching planar glass sheets to curved frames and require keeping the occurring stress within safe limits. However, it is very challenging to navigate the design space of cold bent glass panels due to the fragility of the material, which impedes the form-finding for practically feasible and aesthetically pleasing cold bent glass fa\c{c}ades. We propose an interactive, data-driven approach for designing cold bent glass fa\c{c}ades that can be seamlessly integrated into a typical architectural design pipeline. Our method allows non-expert users to interactively edit a parametric surface while providing real-time feedback on the deformed shape and maximum stress of cold bent glass panels. Designs are automatically refined to minimize several fairness criteria while maximal stresses are kept within glass limits. We achieve interactive frame rates by using a differentiable Mixture Density Network trained from more than a million simulations. Given a curved boundary, our regression model is capable of handling multistable configurations and accurately predicting the equilibrium shape of the panel and its corresponding maximal stress. We show predictions are highly accurate and validate our results with a physical realization of a cold bent glass surface.
One of the popular methods for distributed machine learning (ML) is federated learning, in which devices train local models based on their datasets, which are in turn aggregated periodically by a server. In large-scale fog networks, the "star" learning topology of federated learning poses several challenges in terms of resource utilization. We develop multi-stage hybrid model training (MH-MT), a novel learning methodology for distributed ML in these scenarios. Leveraging the hierarchical structure of fog systems, MH-MT combines multi-stage parameter relaying with distributed consensus formation among devices in a hybrid learning paradigm across network layers. We theoretically derive the convergence bound of MH-MT with respect to the network topology, ML model, and algorithm parameters such as the rounds of consensus employed in different clusters of devices. We obtain a set of policies for the number of consensus rounds at different clusters to guarantee either a finite optimality gap or convergence to the global optimum. Subsequently, we develop an adaptive distributed control algorithm for MH-MT to tune the number of consensus rounds at each cluster of local devices over time to meet convergence criteria. Our numerical experiments validate the performance of MH-MT in terms of convergence speed and resource utilization.
Multivariate relations are general in various types of networks, such as biological networks, social networks, transportation networks, and academic networks. Due to the principle of ternary closures and the trend of group formation, the multivariate relationships in social networks are complex and rich. Therefore, in graph learning tasks of social networks, the identification and utilization of multivariate relationship information are more important. Existing graph learning methods are based on the neighborhood information diffusion mechanism, which often leads to partial omission or even lack of multivariate relationship information, and ultimately affects the accuracy and execution efficiency of the task. To address these challenges, this paper proposes the multivariate relationship aggregation learning (MORE) method, which can effectively capture the multivariate relationship information in the network environment. By aggregating node attribute features and structural features, MORE achieves higher accuracy and faster convergence speed. We conducted experiments on one citation network and five social networks. The experimental results show that the MORE model has higher accuracy than the GCN (Graph Convolutional Network) model in node classification tasks, and can significantly reduce time cost.
Despite the importance of sparsity signal models and the increasing prevalence of high-dimensional streaming data, there are relatively few algorithms for dynamic filtering of time-varying sparse signals. Of the existing algorithms, fewer still provide strong performance guarantees. This paper examines two algorithms for dynamic filtering of sparse signals that are based on efficient l1 optimization methods. We first present an analysis for one simple algorithm (BPDN-DF) that works well when the system dynamics are known exactly. We then introduce a novel second algorithm (RWL1-DF) that is more computationally complex than BPDN-DF but performs better in practice, especially in the case where the system dynamics model is inaccurate. Robustness to model inaccuracy is achieved by using a hierarchical probabilistic data model and propagating higher-order statistics from the previous estimate (akin to Kalman filtering) in the sparse inference process. We demonstrate the properties of these algorithms on both simulated data as well as natural video sequences. Taken together, the algorithms presented in this paper represent the first strong performance analysis of dynamic filtering algorithms for time-varying sparse signals as well as state-of-the-art performance in this emerging application.