Deep reinforcement learning offers a model-free alternative to supervised deep learning and classical optimization for solving the transmit power control problem in wireless networks. The multi-agent deep reinforcement learning approach treats each transmitter as an individual learning agent that determines its transmit power level by observing the local wireless environment. Following a certain policy, these agents learn to collaboratively maximize a global objective, e.g., a sum-rate utility function. This multi-agent scheme is easily scalable and practically applicable to large-scale cellular networks. In this work, we present a continuous power control algorithm with distributed execution, based on deep actor-critic learning and, more specifically, on an adaptation of the deep deterministic policy gradient method. Furthermore, we integrate the proposed power control algorithm into a time-slotted system where devices are mobile and channel conditions change rapidly. We demonstrate the functionality of the proposed algorithm using simulation results.
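As an illustration of the global objective mentioned above, the following is a minimal sketch of a sum-rate utility under an assumed interference model: each link's rate is the Shannon rate of its signal-to-interference-plus-noise ratio. The function name `sum_rate`, the `gains[j][i]` layout, and the shared `noise_power` are assumptions made for this example, not details taken from the abstract.

```python
import math


def sum_rate(gains, powers, noise_power):
    """Sum of Shannon rates (bits/s/Hz) over all links.

    Assumed layout: gains[j][i] is the channel power gain from
    transmitter j to receiver i, and powers[j] is the transmit power
    chosen by agent j. noise_power is the same at every receiver.
    """
    n = len(powers)
    total = 0.0
    for i in range(n):
        signal = gains[i][i] * powers[i]
        # Every other transmitter contributes interference at receiver i.
        interference = sum(gains[j][i] * powers[j] for j in range(n) if j != i)
        sinr = signal / (interference + noise_power)
        total += math.log2(1.0 + sinr)
    return total
```

Because raising one transmitter's power increases the interference seen by every other receiver, the objective couples the agents' decisions, which is why the power levels have to be learned jointly rather than maximized independently.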
We introduce $k$-nearest-neighbor machine translation ($k$NN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search. This approach requires no additional training and scales to give the decoder direct access to billions of examples at test time, resulting in a highly expressive model that consistently improves performance across many settings. Simply adding nearest neighbor search improves a state-of-the-art German-English translation model by 1.5 BLEU. $k$NN-MT allows a single model to be adapted to diverse domains by using a domain-specific datastore, improving results by an average of 9.2 BLEU over zero-shot transfer, and achieving new state-of-the-art results---without training on these domains. A massively multilingual model can also be specialized for particular language pairs, with improvements of 3 BLEU for translating from English into German and Chinese. Qualitatively, $k$NN-MT is easily interpretable; it combines source and target context to retrieve highly relevant examples.
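The nearest-neighbor prediction step can be sketched as follows. This is a minimal illustration under stated assumptions: squared Euclidean distance between representations, a softmax over negative distances with a `temperature`, and linear interpolation with the translation model's own distribution using a weight `lam`. The names `knn_distribution` and `interpolate`, the brute-force search, and the toy datastore layout are hypothetical; a datastore with billions of entries would need an approximate search index instead.

```python
import math


def knn_distribution(query, datastore, k, temperature):
    """Token distribution from the k nearest cached examples.

    datastore is a non-empty list of (key_vector, target_token) pairs,
    where key_vector stands in for a decoder representation.
    """
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    )[:k]
    # Softmax over negative distances; subtract the minimum for stability.
    min_dist = scored[0][0]
    weights = {}
    for dist, token in scored:
        w = math.exp(-(dist - min_dist) / temperature)
        weights[token] = weights.get(token, 0.0) + w
    z = sum(weights.values())
    return {token: w / z for token, w in weights.items()}


def interpolate(p_model, p_knn, lam):
    """Mix the model distribution with the retrieved distribution."""
    tokens = set(p_model) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_model.get(t, 0.0)
        for t in tokens
    }
```

Swapping in a different `datastore` changes the predictions without touching the model's parameters, which is the property that makes domain adaptation without further training possible.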
We present an algorithm for multi-scale tumor (chimeric cell) detection in high resolution slide scans. The broad range of tumor sizes in our dataset poses a challenge for current convolutional neural networks (CNNs), which often fail when image features are very small (8 pixels). Our approach modifies the effective receptive field at different layers in a CNN so that objects with a broad range of scales can be detected in a single forward pass. We define rules for computing adaptive prior anchor boxes, which we show are solvable under the equal proportion interval principle. Two mechanisms in our CNN architecture alleviate the effects of non-discriminative features prevalent in our data: a foveal detection algorithm that incorporates a cascade residual-inception module, and a deconvolution module with additional context information. When integrated into a Single Shot MultiBox Detector (SSD), these additions permit more accurate detection of small-scale objects. The results permit efficient real-time analysis of medical images in pathology and related biomedical research fields.
Recent advances in open-domain dialogue systems rely on the success of neural models that are trained on large-scale data. However, collecting large-scale dialogue data is usually time-consuming and labor-intensive. To address this data dilemma, we propose a novel data augmentation method for training open-domain dialogue models by utilizing unpaired data. Specifically, a data-level distillation process is first proposed to construct augmented dialogues where both post and response are retrieved from the unpaired data. A ranking module is employed to filter out low-quality dialogues. Further, a model-level distillation process is employed to distill a teacher model trained on high-quality paired data to augmented dialogue pairs, thereby preventing dialogue models from being affected by the noise in the augmented data. Automatic and manual evaluations indicate that our method can produce high-quality dialogue pairs with diverse contents, and that the proposed data-level and model-level dialogue distillation can improve the performance of competitive baselines.
To combat the coronavirus disease 2019 (COVID-19) pandemic, the world has vaccination, plasma therapy, herd immunity, and epidemiological interventions as a few possible options. COVID-19 vaccine development is underway, but it may take a significant amount of time to develop a vaccine, and after development it will take further time to vaccinate the entire population. Plasma therapy also has some limitations. Herd immunity can be a plausible option to fight COVID-19 for small countries. For a country with a huge population like India, however, herd immunity is not a plausible option, because acquiring it requires approximately 67% of the population to have recovered from COVID-19 infection, which would put an extra burden on the country's medical system and result in a huge loss of human life. Thus epidemiological interventions (complete lockdown, partial lockdown, quarantine, isolation, social distancing, etc.) are suitable strategies in India to slow down the spread of COVID-19 until a vaccine is developed. In this work, we propose the SIR model with intervention, which incorporates epidemiological interventions into the classical SIR model. To model the effect of the interventions, we introduce $\rho$ as the intervention parameter. $\rho$ is a cumulative quantity that covers all types of intervention. We also discuss a supervised machine learning approach to estimate the transmission rate ($\beta$) for the SIR model with intervention from COVID-19 prevalence data for India and some Indian states. To validate our model, we present a comparison between the actual and model-predicted numbers of COVID-19 cases. Using our model, we also present predicted numbers of active and recovered COVID-19 cases through September 30, 2020, for India as a whole and for some Indian states, and we estimate the 95% and 99% confidence intervals for the predicted cases.
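The abstract does not state how the intervention parameter enters the equations. The sketch below therefore makes a loudly labeled assumption: interventions scale the transmission rate by a factor $(1 - \rho)$ with $0 \le \rho \le 1$, so that new infections occur at rate $(1 - \rho)\,\beta S I / N$ while recoveries occur at rate $\gamma I$. The function name `simulate_sir`, the recovery rate `gamma`, and the forward Euler integration are likewise choices made for this example only.

```python
def simulate_sir(s0, i0, r0, beta, gamma, rho, days, steps_per_day=10):
    """Forward Euler simulation of an SIR model with an intervention factor.

    ASSUMPTION (not from the abstract): the effective transmission rate
    is (1 - rho) * beta. Returns one (S, I, R) tuple per day, starting
    with the initial state.
    """
    n = s0 + i0 + r0
    dt = 1.0 / steps_per_day
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = (1.0 - rho) * beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history
```

Under this assumed form, a larger `rho` lowers and delays the peak of active cases, which matches the stated purpose of the interventions: slowing the spread rather than eliminating it.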
Floods are among the most frequent and catastrophic natural disasters and affect millions of people worldwide. It is important to create accurate flood maps to plan (offline) and conduct (real-time) flood mitigation and flood rescue operations. Arguably, images collected from social media can provide useful information for that task, which would otherwise be unavailable. We introduce a computer vision system that estimates water depth from social media images taken during flooding events, in order to build flood maps in (near) real-time. We propose a multi-task (deep) learning approach, where a model is trained using both a regression and a pairwise ranking loss. Our approach is motivated by the observation that a main bottleneck for image-based flood level estimation is training data: it is difficult and requires a lot of effort to annotate uncontrolled images with the correct water depth. We demonstrate how to efficiently learn a predictor from a small set of annotated water levels and a larger set of weaker annotations that only indicate in which of two images the water level is higher, and are much easier to obtain. Moreover, we provide a new dataset, named DeepFlood, with 8145 annotated ground-level images, and show that the proposed multi-task approach can predict the water level from a single, crowd-sourced image with ~11 cm root mean square error.
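The combination of a regression loss and a pairwise ranking loss can be sketched as follows. The specific forms are assumptions for illustration: mean squared error on the images with annotated depth, a logistic (softplus) penalty on the image pairs with only an ordering label, and a scalar `weight` to balance the two terms. The abstract does not specify which loss functions or weighting are used, and all function names here are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def regression_loss(predicted, target):
    """Mean squared error on images with an annotated water depth."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def pairwise_ranking_loss(pairs):
    """Logistic ranking loss over (pred_higher, pred_lower) pairs.

    Each pair holds the predictions for two images where the first is
    annotated as having the higher water level; the loss is small when
    the predictions respect that ordering.
    """
    return sum(softplus(-(hi - lo)) for hi, lo in pairs) / len(pairs)


def multitask_loss(predicted, target, pairs, weight):
    """Weighted sum of the regression and ranking terms."""
    return regression_loss(predicted, target) + weight * pairwise_ranking_loss(pairs)
```

The ranking term never needs an absolute depth, only the ordering within a pair, so the cheaper comparison annotations can still shape the predictor.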
Inverse problems spanning four or more dimensions such as space, time and other independent parameters have become increasingly important. State-of-the-art 4D reconstruction methods use model-based iterative reconstruction (MBIR), but depend critically on the quality of the prior modeling. Recently, plug-and-play (PnP) methods have been shown to be an effective way to incorporate advanced prior models using state-of-the-art denoising algorithms. However, state-of-the-art denoisers such as BM4D and deep convolutional neural networks (CNNs) are primarily available for 2D or 3D images, and extending them to higher dimensions is difficult due to algorithmic complexity and the increased difficulty of effective training. In this paper, we present multi-slice fusion, a novel algorithm for 4D reconstruction, based on the fusion of multiple low-dimensional denoisers. Our approach uses multi-agent consensus equilibrium (MACE), an extension of plug-and-play, as a framework for integrating the multiple lower-dimensional models. We apply our method to 4D cone-beam X-ray CT reconstruction for non-destructive evaluation (NDE) of samples that are dynamically moving during acquisition. We implement multi-slice fusion on distributed, heterogeneous clusters in order to reconstruct large 4D volumes in reasonable time and demonstrate the inherently parallelizable nature of the algorithm. We present simulated and real experimental results on sparse-view and limited-angle CT data to demonstrate that multi-slice fusion can substantially improve the quality of reconstructions relative to traditional methods, while also being practical to implement and train.
The lack of interpretability and transparency is preventing economists from using advanced tools like neural networks in their empirical work. In this paper, we propose a new class of interpretable neural network models that can achieve both high prediction accuracy and interpretability in regression problems with time series cross-sectional data. Our model can essentially be written as a simple function of a limited number of interpretable features. In particular, we incorporate a class of interpretable functions named persistent change filters as part of the neural network. We apply this model to predicting an individual's monthly employment status using high-dimensional administrative data in China. We achieve an accuracy of 94.5% on the out-of-sample test set, which is comparable to the most accurate conventional machine learning methods. Furthermore, the interpretability of the model allows us to understand the mechanism that underlies the ability to predict employment status using administrative data: an individual's employment status is closely related to whether she pays different types of insurance. Our work is a useful step towards overcoming the "black box" problem of neural networks, and provides a promising new tool for economists to study administrative and proprietary big data.
Real-time traffic navigation is an important capability in smart transportation technologies and has been studied extensively in recent years. Owing to the rapid development of edge devices, collecting real-time traffic data is no longer a problem. However, real-time traffic navigation is still considered particularly challenging because of the time-varying patterns of traffic flow and unpredictable accidents and congestion. To give accurate and reliable navigation results, it is important to predict future traffic flow (speed, congestion, volume, etc.) quickly and accurately. In this paper, we adopt ideas from ensemble learning and develop a two-stage machine learning model to give accurate navigation results. We model the traffic flow as a time series and apply the XGBoost algorithm to obtain accurate predictions of future traffic conditions (1st stage). We then apply the Top K Dijkstra algorithm to find a set of shortest paths from the given start point to the destination as candidates for the output optimal path. Using the prediction results from the 1st stage, we select one optimal path from the candidates as the output of the navigation algorithm. We show that our navigation algorithm can be greatly improved via EOPF (Enhanced Optimal Path Finding), which is based on a neural network (2nd stage). We show that our method can be over 7% better than the method without EOPF in many situations, which indicates the effectiveness of our model.
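The candidate-generation and selection steps can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the K shortest loop-free paths are enumerated with a Dijkstra-style best-first search over partial paths (whether this matches the Top K Dijkstra variant used in the paper is not stated), and the final path is simply the candidate with the smallest total predicted travel time. The `predicted_time` dictionary stands in for the output of the first-stage predictor, and the neural EOPF stage is not modeled. All names are hypothetical.

```python
import heapq


def k_shortest_paths(graph, source, target, k):
    """Return up to k loop-free paths in order of increasing static cost.

    graph maps each node to a dict {neighbor: non-negative edge cost}.
    Partial paths are expanded cheapest-first, so complete paths are
    popped in non-decreasing cost order.
    """
    heap = [(0.0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in path:  # keep paths loop-free
                heapq.heappush(heap, (cost + weight, path + [neighbor]))
    return found


def best_by_prediction(candidates, predicted_time):
    """Pick the candidate path with the smallest predicted travel time.

    predicted_time maps (from_node, to_node) to a predicted edge time.
    """
    def total(path):
        return sum(predicted_time[(a, b)] for a, b in zip(path, path[1:]))

    return min((path for _, path in candidates), key=total)
```

Separating the two steps means the static road network only has to be searched once per query, while the time-varying predictions are applied to a short list of candidates. The enumeration is exponential in the worst case, so a production system would likely use a dedicated K-shortest-paths algorithm.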
The accuracy of deep convolutional neural networks (CNNs) generally improves when they are fed high-resolution images. However, this often comes at a high computational cost and a high memory footprint. Inspired by the fact that not all regions in an image are task-relevant, we propose a novel framework that performs efficient image classification by processing a sequence of relatively small inputs, which are strategically selected from the original image with reinforcement learning. Such a dynamic decision process naturally facilitates adaptive inference at test time, i.e., it can be terminated once the model is sufficiently confident about its prediction and thus avoids further redundant computation. Notably, our framework is general and flexible as it is compatible with most of the state-of-the-art lightweight CNNs (such as MobileNets, EfficientNets and RegNets), which can be conveniently deployed as the backbone feature extractor. Experiments on ImageNet show that our method consistently improves the computational efficiency of a wide variety of deep models. For example, it further reduces the average latency of the highly efficient MobileNet-V3 on an iPhone XS Max by 20% without sacrificing accuracy. Code and pre-trained models are available at https://github.com/blackfeather-wang/GFNet-Pytorch.
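The early-termination idea can be sketched in a few lines. This is a simplified illustration, not the released code: `patch_predictions` is a hypothetical iterable that yields the model's class probabilities after each additional small input has been processed, and the confidence measure (the largest class probability compared against a fixed `threshold`) is an assumption. Patch selection by reinforcement learning is not modeled here.

```python
def adaptive_classify(patch_predictions, threshold):
    """Stop consuming inputs once the prediction is confident enough.

    patch_predictions yields one {label: probability} dict per processed
    patch. Returns (label, confidence, number_of_patches_used).
    """
    label, confidence, steps = None, 0.0, 0
    for probabilities in patch_predictions:
        steps += 1
        label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
        if confidence >= threshold:
            break  # skip the remaining, now redundant, computation
    return label, confidence, steps
```

If `patch_predictions` is a generator that runs the backbone lazily, breaking out of the loop means the later patches are never computed, so easy images cost less than hard ones.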