A mega-constellation of low-earth orbit (LEO) satellites has the potential to enable long-range communication with low latency. Integrating such a constellation with burgeoning unmanned aerial vehicle (UAV)-assisted non-terrestrial networks will be a disruptive solution for beyond-5G systems, provisioning large-scale three-dimensional connectivity. In this article, we study the problem of forwarding packets between two faraway ground terminals through an LEO satellite selected from an orbiting constellation and a mobile high-altitude platform (HAP) such as a fixed-wing UAV. To maximize the end-to-end data rate, the satellite association and the HAP location should be jointly optimized, which is challenging due to the huge number of orbiting satellites and the resulting time-varying network topology. We tackle this problem using deep reinforcement learning (DRL) with a novel action dimension reduction technique. Simulation results corroborate that our proposed method achieves up to 5.74x higher average data rate than a direct-communication baseline that uses neither the satellite nor the HAP.
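To give a rough sense of how the action space can be shrunk, the sketch below prunes the satellite-selection action to the K nearest visible satellites and scores each candidate by the bottleneck end-to-end rate; the pruning rule and all link-budget constants are illustrative assumptions, not the paper's exact technique.

```python
import numpy as np

def shannon_rate(bandwidth_hz, tx_power, gain, noise_power, distance_m):
    """Free-space-like SNR model; all constants here are illustrative."""
    snr = tx_power * gain / (noise_power * distance_m ** 2)
    return bandwidth_hz * np.log2(1.0 + snr)

def reduced_action_set(sat_positions, hap_position, k=8):
    """Keep only the K closest satellites: one simple way to shrink the action
    dimension from thousands of orbiting satellites down to K candidates."""
    dists = np.linalg.norm(sat_positions - hap_position, axis=1)
    return np.argsort(dists)[:k]

def best_satellite(sat_positions, hap_position, ground_distance_m, k=8):
    candidates = reduced_action_set(sat_positions, hap_position, k)
    rates = []
    for idx in candidates:
        d_sat = np.linalg.norm(sat_positions[idx] - hap_position)
        up = shannon_rate(20e6, 10.0, 1e6, 1e-13, ground_distance_m)  # ground -> HAP
        down = shannon_rate(20e6, 10.0, 1e6, 1e-13, d_sat)            # HAP -> satellite
        rates.append(min(up, down))                                   # bottleneck rate
    best = candidates[int(np.argmax(rates))]
    return best, max(rates)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sats = rng.uniform(-2e6, 2e6, size=(1000, 3)) + np.array([0, 0, 550e3])
    hap = np.array([0.0, 0.0, 20e3])
    sat_id, rate = best_satellite(sats, hap, ground_distance_m=100e3)
    print(sat_id, rate)
```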
Adversarial training is one of the most popular ways to learn robust models, but it is usually attack-dependent and time-costly. In this paper, we propose the MACER algorithm, which learns robust models without using adversarial training yet performs better than all existing provable $\ell_2$ defenses. Recent work shows that randomized smoothing can be used to provide a certified $\ell_2$ radius to smoothed classifiers, and our algorithm trains provably robust smoothed classifiers via MAximizing the CErtified Radius (MACER). The attack-free characteristic makes MACER faster to train and easier to optimize. In our experiments, we show that our method can be applied to modern deep neural networks on a wide range of datasets, including CIFAR-10, ImageNet, MNIST, and SVHN. For all tasks, MACER spends less training time than state-of-the-art adversarial training algorithms, and the learned models achieve a larger average certified radius.
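For context, the certified radius that MACER maximizes comes from the randomized-smoothing bound of Cohen et al. (2019); the minimal sketch below just evaluates that radius, leaving out MACER's differentiable surrogate and training loop.

```python
import numpy as np
from scipy.stats import norm

def certified_radius(sigma, p_top, p_runner_up):
    """Certified l2 radius of a smoothed classifier (Cohen et al., 2019):
    R = sigma/2 * (Phi^{-1}(p_A) - Phi^{-1}(p_B)), where p_A and p_B are the
    probabilities of the top and runner-up classes under Gaussian noise.
    MACER's objective maximizes this quantity for correctly classified points."""
    return 0.5 * sigma * (norm.ppf(p_top) - norm.ppf(p_runner_up))

if __name__ == "__main__":
    # Example: top-class probability 0.9, runner-up 0.08, noise level sigma = 0.25
    print(certified_radius(sigma=0.25, p_top=0.9, p_runner_up=0.08))
```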
Electronic medical record (EMR) data contains historical sequences of patient visits, and each visit contains rich information, such as patient demographics, hospital utilisation and medical codes, including diagnosis, procedure and medication codes. Most existing EMR embedding methods capture visit-code associations by constructing input visit representations as binary vectors over a static vocabulary of medical codes. With this limited representation, they fail to encapsulate the rich attribute information of visits (demographics and utilisation information) and/or codes (e.g., medical code descriptions). Furthermore, current work treats the visits of the same patient as discrete-time events and ignores the time gaps between them. However, these time gaps reflect the dynamics of the patient's medical history and exert varying influences on future visits. To address these limitations, we present $\mathtt{MedGraph}$, a supervised EMR embedding method that captures two types of information: (1) the visit-code associations in an attributed bipartite graph, and (2) the temporal sequencing of visits through point processes. $\mathtt{MedGraph}$ produces Gaussian embeddings for visits and codes to model uncertainty. We evaluate the performance of $\mathtt{MedGraph}$ through an extensive experimental study and show that $\mathtt{MedGraph}$ outperforms state-of-the-art EMR embedding methods in several medical risk prediction tasks.
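As one hedged illustration of Gaussian embeddings, the snippet below scores a visit-code pair with the KL divergence between diagonal Gaussians, a common choice for uncertainty-aware embeddings; the exact score and training objective used by $\mathtt{MedGraph}$ may differ.

```python
import numpy as np

def kl_diag_gaussians(mu_i, var_i, mu_j, var_j):
    """KL( N(mu_i, diag(var_i)) || N(mu_j, diag(var_j)) ) for diagonal Gaussians.
    Gaussian embedding methods often use such a dissimilarity so that
    high-variance (uncertain) embeddings are penalized more gently."""
    ratio = var_i / var_j
    sq = (mu_j - mu_i) ** 2 / var_j
    return 0.5 * np.sum(ratio + sq - 1.0 - np.log(ratio))

if __name__ == "__main__":
    mu_visit, var_visit = np.array([0.2, -0.5]), np.array([0.3, 0.4])
    mu_code, var_code = np.array([0.1, -0.4]), np.array([0.2, 0.5])
    print(kl_diag_gaussians(mu_visit, var_visit, mu_code, var_code))
```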
Detecting multiple fundamental frequencies and separating sources from a single-channel signal containing multiple oscillatory components and nonstationary noise are both challenging tasks. To extract the fetal electrocardiogram (ECG) from a single-lead maternal abdominal ECG, we face both challenges. In this paper, we propose a novel method to extract the fetal ECG signal from the single-channel maternal abdominal ECG signal, without any additional measurement. The algorithm is composed of three main ingredients. First, the maternal and fetal heart rates are estimated by the de-shape short-time Fourier transform, a recently proposed nonlinear time-frequency analysis technique; second, a beat-tracking technique is applied to accurately locate the maternal and fetal R peaks; third, the maternal and fetal ECG waveforms are reconstructed by the nonlocal median. The algorithm is evaluated on a simulated fetal ECG signal database ({\em fecgsyn} database) and tested on two real databases with annotations provided by experts ({\em adfecgdb} database and {\em CinC2013} database). In general, the algorithm could be applied to solve other detection and source separation problems and to reconstruct the time-varying wave-shape function of each oscillatory component.
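A minimal sketch of the nonlocal-median idea in the third step is given below: for each detected R peak, the beat waveform is estimated as the pointwise median over the most similar beats. The similarity measure, window sizes, and the toy signal are illustrative assumptions.

```python
import numpy as np

def nonlocal_median_beat(signal, r_peaks, half_window, n_neighbors=20):
    """Sketch of the nonlocal-median idea: for each detected R peak, gather the
    segments around the most similar beats (by Euclidean distance) and take a
    pointwise median to estimate the clean beat waveform.
    Assumes r_peaks are sample indices far enough from the signal boundaries."""
    beats = np.stack([signal[p - half_window:p + half_window] for p in r_peaks])
    cleaned = np.empty_like(beats)
    for i, beat in enumerate(beats):
        dists = np.linalg.norm(beats - beat, axis=1)
        neighbors = np.argsort(dists)[:n_neighbors]
        cleaned[i] = np.median(beats[neighbors], axis=0)
    return cleaned

if __name__ == "__main__":
    fs = 250
    t = np.arange(0, 10, 1 / fs)
    ecg_like = np.sin(2 * np.pi * 1.2 * t) ** 15 + 0.1 * np.random.randn(t.size)
    peaks = np.arange(fs, t.size - fs, int(fs / 1.2))   # stand-in for tracked R peaks
    print(nonlocal_median_beat(ecg_like, peaks, half_window=50).shape)
```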
Deep neural networks with more parameters and FLOPs have higher capacity and generalize better to diverse domains. But to be deployed on edge devices, a model's complexity has to be constrained due to limited compute resources. In this work, we propose a method to improve model capacity without increasing inference-time complexity. Our method is based on an assumption of data locality: for an edge device, within a short period of time, the input data to the device are sampled from a single domain with relatively low diversity. Therefore, it is possible to use a specialized, low-complexity model to achieve good performance in that input domain. To leverage this, we propose the Domain-aware Dynamic Network (DDN), a high-capacity dynamic network in which each layer contains multiple weights. During inference, based on the input domain, DDN dynamically combines those weights into a single weight specialized for the given domain. This way, DDN keeps the inference-time complexity low while maintaining high capacity. Experiments show that without increasing parameters, FLOPs, or actual latency, DDN achieves up to 2.6\% higher AP50 than a static network on the BDD100K object-detection benchmark.
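A minimal sketch of the weight-combination step is shown below, assuming a softmax over a small bank of layer weights driven by a domain signal; the actual domain predictor and combination rule in DDN may differ.

```python
import numpy as np

def combine_weights(weight_bank, domain_logits):
    """Each layer stores a bank of K weight tensors; a domain-dependent softmax
    mixes them into a single weight, so the per-input compute matches a static layer."""
    alphas = np.exp(domain_logits - domain_logits.max())
    alphas /= alphas.sum()
    return np.tensordot(alphas, weight_bank, axes=1)   # (K,) x (K, out, in) -> (out, in)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bank = rng.standard_normal((4, 16, 32))     # K = 4 candidate weights for one layer
    logits = rng.standard_normal(4)             # produced by a small domain predictor
    w = combine_weights(bank, logits)           # single specialized weight
    x = rng.standard_normal(32)
    print((w @ x).shape)                        # same cost as a static 16x32 layer
```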
Model selection consists of comparing several candidate models according to a metric to be optimized. The process often involves a grid search (or similar) combined with cross-validation, which can be time-consuming and provides little information about the dataset itself. In this paper we propose a method to reduce the scope of exploration needed for the task. The idea is to quantify how far it is necessary to depart from trained instances of a given family of reference models (RMs) carrying `rigid' decision boundaries (e.g., decision trees) in order to obtain an equivalent or better model. In our approach, this is realized by progressively relaxing the decision boundaries of the initial decision trees (the RMs) as long as this is beneficial in terms of performance measured on the analyzed dataset. More specifically, this relaxation is performed using a neural decision tree, which is a neural network built from decision trees. The final model produced by our method carries non-linear decision boundaries. Measuring the performance of the final model, and its agreement with its seeding RM, can help the user decide which family of models to focus on.
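To illustrate the relaxation, the sketch below replaces the hard threshold at each internal node of a small, fixed decision tree with a sigmoid routing probability; as the temperature goes to zero, the original hard tree is recovered. The tree layout and parameterization are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, nodes, leaves, temperature=1.0):
    """Soft version of a depth-2 decision tree: each internal node (feature,
    threshold) routes the input with a sigmoid instead of a hard split, and the
    prediction is the path-probability-weighted mixture of the leaf values."""
    p = [sigmoid((x[f] - t) / temperature) for f, t in nodes]  # P(go right) per node
    path_probs = np.array([
        (1 - p[0]) * (1 - p[1]),   # leaf 0: left, left
        (1 - p[0]) * p[1],         # leaf 1: left, right
        p[0] * (1 - p[2]),         # leaf 2: right, left
        p[0] * p[2],               # leaf 3: right, right
    ])
    return path_probs @ leaves

if __name__ == "__main__":
    nodes = [(0, 0.5), (1, -0.2), (1, 0.3)]      # (feature, threshold) per internal node
    leaves = np.array([0.0, 1.0, 1.0, 0.0])      # leaf predictions from the seeding DT
    x = np.array([0.7, 0.1])
    print(soft_tree_predict(x, nodes, leaves, temperature=0.1))
```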
The paper focuses on the problem of vision-based obstacle detection and tracking for unmanned aerial vehicle navigation. A real-time object localization and tracking strategy for monocular image sequences is developed by effectively integrating object detection and tracking into a dynamic Kalman model. At the detection stage, the object of interest is automatically detected and localized from a saliency map computed via the image background connectivity cue at each frame; at the tracking stage, a Kalman filter is employed to provide a coarse prediction of the object state, which is further refined via a local detector incorporating the saliency map and the temporal information between two consecutive frames. Compared to existing methods, the proposed approach does not require any manual initialization for tracking, runs much faster than state-of-the-art trackers of its kind, and achieves competitive tracking performance on a large number of image sequences. Extensive experiments demonstrate the effectiveness and superior performance of the proposed approach.
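A minimal sketch of the coarse-prediction step is shown below: one predict/update cycle of a constant-velocity Kalman filter on the object's image-plane center, with the measurement supplied by the saliency-based detector. The state parameterization and noise constants are illustrative.

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=1e-2, r=1.0):
    """One predict/update cycle of a constant-velocity Kalman filter on the
    state (cx, cy, vx, vy); z is the detected object center for this frame."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    Q, R = q * np.eye(4), r * np.eye(2)
    # Predict: coarse estimate of the next object state
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: refine with the saliency-based measurement
    y = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ y
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new

if __name__ == "__main__":
    x, P = np.array([100.0, 80.0, 0.0, 0.0]), np.eye(4)
    for z in [np.array([102.0, 81.0]), np.array([104.5, 82.2])]:
        x, P = kalman_step(x, P, z)
    print(x[:2])   # refined object center
```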
We study "learning-based" sketching approaches for diverse tasks in numerical linear algebra: least-squares regression, $\ell_p$ regression, Huber regression, low-rank approximation (LRA), and $k$-means clustering. Sketching methods are used to quickly and approximately compute properties of large matrices. Linear maps called "sketches" are applied to compress data, and these concise representations are used to compute the desired properties. Specifically, we consider sparse sketches (such as CountSketch). Recent works have dealt with optimizing sketches for data distributions to perform better than their random counterparts. We extend this theme to several important and ubiquitous tasks, each of which requires a new analysis and novel practical methods. Specifically, our contributions are: 1) For all tasks, we introduce fast algorithms using learned sketches with worst-case guarantees. We give a simple task-agnostic method for retaining the worst-case guarantees of randomized sketching, which yields time-optimal algorithms for LRA and least-squares regression. Also, for $k$-means clustering, we give a faster alternative for retaining worst-case guarantees. 2) We show empirically that learned sketches are reliable in improving approximation accuracy, with comparison against "non-learned" sketching baselines. 3) We introduce a greedy algorithm for optimizing the location of the nonzero entries of a sparse sketch and prove guarantees for certain distributions on the LRA task. Previous work only looked at optimizing the values rather than the locations. Also, we show empirically that it further improves learned sketch performance.
In this paper we address the problem of predicting a time series using the ARMA (autoregressive moving average) model, under minimal assumptions on the noise terms. Using regret minimization techniques, we develop effective online learning algorithms for the prediction problem, without assuming that the noise terms are Gaussian, identically distributed, or even independent. Furthermore, we show that our algorithm's performance asymptotically approaches that of the best ARMA model in hindsight.
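A minimal sketch of the regret-minimization idea, under the common improper-learning view that an ARMA predictor can be approximated by a sufficiently long AR predictor, is shown below: online gradient descent on the squared loss of an AR(p) model, with no Gaussian or i.i.d. assumptions on the noise. The order and learning rate are illustrative choices, not the paper's exact algorithm.

```python
import numpy as np

def online_ar_predictions(series, p=5, lr=0.01):
    """Improper online learning for ARMA prediction: an AR(p) predictor is
    updated by online gradient descent on the squared loss at every step."""
    w = np.zeros(p)
    preds = []
    for t in range(p, len(series)):
        window = series[t - p:t][::-1]          # most recent value first
        y_hat = w @ window
        preds.append(y_hat)
        grad = 2.0 * (y_hat - series[t]) * window
        w -= lr * grad                           # online gradient step
    return np.array(preds)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.laplace(size=2000)               # non-Gaussian noise terms
    series = np.zeros(2000)
    for t in range(2, 2000):                     # ARMA(2,1)-like generating process
        series[t] = 0.6 * series[t - 1] - 0.3 * series[t - 2] + noise[t] + 0.4 * noise[t - 1]
    preds = online_ar_predictions(series)
    print(np.mean((preds[-500:] - series[-500:]) ** 2))   # late-horizon prediction error
```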
Current state-of-the-art methods for semantic segmentation rely heavily on large sets of images in which every pixel is labeled with a class of interest or as background. Producing such labels, especially in domains that require expert annotators, comes at a heavy cost in time and money. Several methods have shown that semantic segmentation can be learned from less expensive image-level labels, but the effectiveness of point-level labels, a healthy compromise between labelling all pixels and none, still remains largely unexplored. This paper presents a novel procedure for producing semantic segmentation from images given some point-level annotations. Our method includes point annotations in the training of a convolutional neural network (CNN) to produce improved localization and class activation maps. We then use another CNN to predict semantic affinities, which propagate the rough class labels and create pseudo semantic segmentation labels. Finally, we train a CNN that is normally fully supervised using our pseudo-labels in place of ground-truth labels, which further improves performance and simplifies inference by requiring just one CNN rather than two. Our method achieves state-of-the-art results for point-supervised semantic segmentation on the PASCAL VOC 2012 dataset \cite{everingham2010pascal}, even outperforming state-of-the-art methods that use stronger bounding-box and squiggle supervision.
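As a hedged illustration of the affinity-propagation step, the snippet below sharpens pairwise affinities into a random-walk transition matrix and propagates class activation map (CAM) scores into per-pixel pseudo labels; the actual affinity network and propagation schedule in the paper may differ.

```python
import numpy as np

def propagate_cam(cam_scores, affinity, n_steps=8, beta=8):
    """Propagate rough class scores into pseudo segmentation labels: sharpen the
    pairwise affinities, row-normalize them into a transition matrix, and run a
    few random-walk steps over the CAM scores.
    Shapes: cam_scores is (num_pixels, num_classes), affinity is (num_pixels, num_pixels)."""
    T = affinity ** beta
    T = T / T.sum(axis=1, keepdims=True)
    scores = cam_scores.copy()
    for _ in range(n_steps):
        scores = T @ scores
    return scores.argmax(axis=1)        # per-pixel pseudo label

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_pix, n_cls = 64, 3
    cam = rng.random((n_pix, n_cls))    # stand-in for point-supervised CAM scores
    aff = rng.random((n_pix, n_pix))
    aff = (aff + aff.T) / 2             # affinities are symmetric
    print(propagate_cam(cam, aff)[:10])
```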