Type 2 diabetes (T2DM), one of the most prevalent chronic diseases, affects the glucose metabolism of the human body, which decreases the quantity of life and brings a heavy burden on social medical care. Patients with T2DM are more likely to suffer bone fragility fracture as diabetes affects bone mineral density (BMD). However, the discovery of the determinant factors of BMD in a medical way is expensive and time-consuming. In this paper, we propose a novel algorithm, Prior-Knowledge-driven local Causal structure Learning (PKCL), to discover the underlying causal mechanism between BMD and its factors from the clinical data. Since there exist limited data but redundant prior knowledge for medicine, PKCL adequately utilize the prior knowledge to mine the local causal structure for the target relationship. Combining the medical prior knowledge with the discovered causal relationships, PKCL can achieve more reliable results without long-standing medical statistical experiments. Extensive experiments are conducted on a newly provided clinical data set. The experimental study of PKCL on the data is proved to highly corresponding with existing medical knowledge, which demonstrates the superiority and effectiveness of PKCL. To illustrate the importance of prior knowledge, the result of the algorithm without prior knowledge is also investigated.
Probabilistic graphical models, such as Markov random fields (MRF), exploit dependencies among random variables to model a rich family of joint probability distributions. Sophisticated inference algorithms, such as belief propagation (BP), can effectively compute the marginal posteriors. Nonetheless, it is still difficult to interpret the inference outcomes for important human decision making. There is no existing method to rigorously attribute the inference outcomes to the contributing factors of the graphical models. Shapley values provide an axiomatic framework, but naively computing or even approximating the values on general graphical models is challenging and less studied. We propose GraphShapley to integrate the decomposability of Shapley values, the structure of MRFs, and the iterative nature of BP inference in a principled way for fast Shapley value computation, that 1) systematically enumerates the important contributions to the Shapley values of the explaining variables without duplicate; 2) incrementally compute the contributions without starting from scratches. We theoretically characterize GraphShapley regarding independence, equal contribution, and additivity. On nine graphs, we demonstrate that GraphShapley provides sensible and practical explanations.
Personal mobile sensing is fast permeating our daily lives to enable activity monitoring, healthcare and rehabilitation. Combined with deep learning, these applications have achieved significant success in recent years. Different from conventional cloud-based paradigms, running deep learning on devices offers several advantages including data privacy preservation and low-latency response for both model inference and update. Since data collection is costly in reality, Google's Federated Learning offers not only complete data privacy but also better model robustness based on multiple user data. However, personal mobile sensing applications are mostly user-specific and highly affected by environment. As a result, continuous local changes may seriously affect the performance of a global model generated by Federated Learning. In addition, deploying Federated Learning on a local server, e.g., edge server, may quickly reach the bottleneck due to resource constraint and serious failure by attacks. Towards pushing deep learning on devices, we present MDLdroid, a novel decentralized mobile deep learning framework to enable resource-aware on-device collaborative learning for personal mobile sensing applications. To address resource limitation, we propose a ChainSGD-reduce approach which includes a novel chain-directed Synchronous Stochastic Gradient Descent algorithm to effectively reduce overhead among multiple devices. We also design an agent-based multi-goal reinforcement learning mechanism to balance resources in a fair and efficient manner. Our evaluations show that our model training on off-the-shelf mobile devices achieves 2x to 3.5x faster than single-device training, and 1.5x faster than the master-slave approach.
In many professional fields, such as medicine, remote sensing and sciences, users often demand image compression methods to be mathematically lossless. But lossless image coding has a rather low compression ratio (around 2:1 for natural images). The only known technique to achieve significant compression while meeting the stringent fidelity requirements is the methodology of $\ell_\infty$-constrained coding that was developed and standardized in nineties. We make a major progress in $\ell_\infty$-constrained image coding after two decades, by developing a novel CNN-based soft $\ell_\infty$-constrained decoding method. The new method repairs compression defects by using a restoration CNN called the $\ell_\infty\mbox{-SDNet}$ to map a conventionally decoded image to the latent image. A unique strength of the $\ell_\infty\mbox{-SDNet}$ is its ability to enforce a tight error bound on a per pixel basis. As such, no small distinctive structures of the original image can be dropped or distorted, even if they are statistical outliers that are otherwise sacrificed by mainstream CNN restoration methods. More importantly, this research ushers in a new image compression system of $\ell_\infty$-constrained encoding and deep soft decoding ($\ell_\infty\mbox{-ED}^2$). The $\ell_\infty \mbox{-ED}^2$ approach beats the best of existing lossy image compression methods (e.g., BPG, WebP, etc.) not only in $\ell_\infty$ but also in $\ell_2$ error metric and perceptual quality, for bit rates near the threshold of perceptually transparent reconstruction. Operationally, the new compression system is practical, with a low-complexity real-time encoder and a cascade decoder consisting of a fast initial decoder and an optional CNN soft decoder.
Probabilistic inferences distill knowledge from graphs to aid human make important decisions. Due to the inherent uncertainty in the model and the complexity of the knowledge, it is desirable to help the end-users understand the inference outcomes. Different from deep or high-dimensional parametric models, the lack of interpretability in graphical models is due to the cyclic and long-range dependencies and the byzantine inference procedures. Prior works did not tackle cycles and make \textit{the} inferences interpretable. To close the gap, we formulate the problem of explaining probabilistic inferences as a constrained cross-entropy minimization problem to find simple subgraphs that faithfully approximate the inferences to be explained. We prove that the optimization is NP-hard, while the objective is not monotonic and submodular to guarantee efficient greedy approximation. We propose a general beam search algorithm to find simple trees to enhance the interpretability and diversity in the explanations, with parallelization and a pruning strategy to allow efficient search on large and dense graphs without hurting faithfulness. We demonstrate superior performance on 10 networks from 4 distinct applications, comparing favorably to other explanation methods. Regarding the usability of the explanation, we visualize the explanation in an interface that allows the end-users to explore the diverse search results and find more personalized and sensible explanations.
Given the success of the deep convolutional neural networks (DCNNs) in applications of visual recognition and classification, it would be tantalizing to test if DCNNs can also learn spatial concepts, such as straightness, convexity, left/right, front/back, relative size, aspect ratio, polygons, etc., from varied visual examples of these concepts that are simple and yet vital for spatial reasoning. Much to our dismay, extensive experiments of the type of cognitive psychology demonstrate that the data-driven deep learning (DL) cannot see through superficial variations in visual representations and grasp the spatial concept in abstraction. The root cause of failure turns out to be the learning methodology, not the computational model of the neural network itself. By incorporating task-specific convolutional kernels, we are able to construct DCNNs for spatial cognition tasks that can generalize to input images not drawn from the same distribution of the training set. This work raises a precaution that without manually-incorporated priors or features DCCNs may fail spatial cognitive tasks at rudimentary level.
Combining deep neural networks with reinforcement learning has shown great potential in the next-generation intelligent control. However, there are challenges in terms of safety and cost in practical applications. In this paper, we propose the Intervention Aided Reinforcement Learning (IARL) framework, which utilizes human intervened robot-environment interaction to improve the policy. We used the Unmanned Aerial Vehicle (UAV) as the test platform. We built neural networks as our policy to map sensor readings to control signals on the UAV. Our experiment scenarios cover both simulation and reality. We show that our approach substantially reduces the human intervention and improves the performance in autonomous navigation, at the same time it ensures safety and keeps training cost acceptable.
Deep convolutional neural networks (DCNN) have enjoyed great successes in many signal processing applications because they can learn complex, non-linear causal relationships from input to output. In this light, DCNNs are well suited for the task of sequential prediction of multidimensional signals, such as images, and have the potential of improving the performance of traditional linear predictors. In this research we investigate how far DCNNs can push the envelop in terms of prediction precision. We propose, in a case study, a two-stage deep regression DCNN framework for nonlinear prediction of two-dimensional image signals. In the first-stage regression, the proposed deep prediction network (PredNet) takes the causal context as input and emits a prediction of the present pixel. Three PredNets are trained with the regression objectives of minimizing $\ell_1$, $\ell_2$ and $\ell_\infty$ norms of prediction residuals, respectively. The second-stage regression combines the outputs of the three PredNets to generate an even more precise and robust prediction. The proposed deep regression model is applied to lossless predictive image coding, and it outperforms the state-of-the-art linear predictors by appreciable margin.
Recently a number of CNN-based techniques were proposed to remove image compression artifacts. As in other restoration applications, these techniques all learn a mapping from decompressed patches to the original counterparts under the ubiquitous L2 metric. However, this approach is incapable of restoring distinctive image details which may be statistical outliers but have high semantic importance (e.g., tiny lesions in medical images). To overcome this weakness, we propose to incorporate an L-infinity fidelity criterion in the design of neural network so that no small, distinctive structures of the original image can be dropped or distorted. Moreover, our anti-artifacts neural network is designed to work on a range of compression bit rates, rather than a fixed one as in the past. Experimental results demonstrate that the proposed method outperforms the state-of-the-art methods in L-infinity error metric and perceptual quality, while being competitive in L2 error metric as well. It can restore subtle image details that are otherwise destroyed or missed by other algorithms. Our research suggests a new machine learning paradigm of ultra high fidelity image compression that is ideally suited for applications in medicine, space, and sciences.