Deep convolutional neural networks (DCNNs) trained on a large number of images with strong pixel-level annotations have recently significantly pushed the state of the art in semantic image segmentation. We study the more challenging problem of learning DCNNs for semantic image segmentation from either (1) weakly annotated training data such as bounding boxes or image-level labels or (2) a combination of few strongly labeled and many weakly labeled images, sourced from one or multiple datasets. We develop Expectation-Maximization (EM) methods for semantic image segmentation model training under these weakly supervised and semi-supervised settings. Extensive experimental evaluation shows that the proposed techniques can learn models delivering competitive results on the challenging PASCAL VOC 2012 image segmentation benchmark, while requiring significantly less annotation effort. We share source code implementing the proposed system at https://bitbucket.org/deeplab/deeplab-public.
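The EM idea above can be illustrated schematically: in the E-step, latent pixel labels are estimated from the current model's score maps under the weak image-level constraints; in the M-step, the DCNN is retrained on those estimated labels. The sketch below is a minimal illustration under those assumptions, not the released implementation; all names (`model.predict_scores`, `train_step`, the `bias` heuristic) are hypothetical.

```python
# Sketch of an EM-style loop for weakly supervised segmentation from
# image-level labels. All names are illustrative, not the paper's code.
import numpy as np

def e_step(score_maps, image_labels, bias=2.0):
    """Estimate latent pixel labels from the current per-class score maps.

    score_maps: (C, H, W) float array of class scores from the current model.
    image_labels: set of class indices known to be present in the image.
    Present classes get a score boost so at least some pixels are assigned
    to them; absent classes are suppressed entirely.
    """
    adjusted = score_maps.copy()
    for c in range(score_maps.shape[0]):
        if c in image_labels:
            adjusted[c] += bias       # encourage classes the image contains
        else:
            adjusted[c] = -np.inf     # forbid classes the image lacks
    return adjusted.argmax(axis=0)    # (H, W) hard latent labeling

def em_train(model, dataset, n_rounds=10):
    for _ in range(n_rounds):
        for image, image_labels in dataset:
            scores = model.predict_scores(image)         # (C, H, W)
            pixel_labels = e_step(scores, image_labels)  # E-step
            model.train_step(image, pixel_labels)        # M-step: SGD update
```

Boosting present classes is one simple way to encode the image-level constraint; the paper's EM variants refine this considerably.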
Relational machine learning studies methods for the statistical analysis of relational, or graph-structured, data. In this paper, we provide a review of how such statistical models can be "trained" on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). In particular, we discuss two fundamentally different kinds of statistical relational models, both of which can scale to massive datasets. The first is based on latent feature models such as tensor factorization and multiway neural networks. The second is based on mining observable patterns in the graph. We also show how to combine these latent and observable models to get improved modeling power at decreased computational cost. Finally, we discuss how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web. To this end, we also discuss Google's Knowledge Vault project as an example of such a combination.
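To make the first model family concrete, here is a minimal sketch of a bilinear latent feature model in the spirit of RESCAL-style tensor factorization: each entity gets an embedding vector, each relation a matrix, and a triple's plausibility is a bilinear score. Dimensions and names are illustrative assumptions, and training (e.g., by minimizing a ranking loss) is omitted.

```python
# Minimal sketch of a bilinear latent feature model for knowledge graph
# link prediction; embeddings are random here purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, d = 1000, 20, 50

E = rng.normal(scale=0.1, size=(n_entities, d))      # entity embeddings
W = rng.normal(scale=0.1, size=(n_relations, d, d))  # one matrix per relation

def score(s, p, o):
    """Plausibility of the triple (subject s, predicate p, object o)."""
    return E[s] @ W[p] @ E[o]

def top_objects(s, p, k=5):
    """Rank candidate objects for a query (s, p, ?)."""
    scores = E @ (E[s] @ W[p])   # score against every entity at once
    return np.argsort(-scores)[:k]
```

Observable-pattern models, by contrast, score a candidate edge from features of paths and subgraphs already present in the graph; the review discusses how the two families complement each other.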
The field of neuroimaging has truly become data rich, and novel analytical methods capable of gleaning meaningful information from large stores of imaging data are in high demand. Those methods that might also be applicable on the level of individual subjects, and thus potentially useful clinically, are of special interest. In the present study, we introduce just such a method, called nonlinear functional mapping (NFM), and demonstrate its application in the analysis of resting state fMRI from a 242-subject subset of the IMAGEN project, a European study of adolescents that includes longitudinal phenotypic, behavioral, genetic, and neuroimaging data. NFM employs a computational technique inspired by biological evolution to discover and mathematically characterize interactions among ROIs (regions of interest), without making linear or univariate assumptions. We show that statistics of the resulting interaction relationships comport with recent independent work, constituting a preliminary cross-validation. Furthermore, nonlinear terms are ubiquitous in the models generated by NFM, suggesting that some of the interactions characterized here are not discoverable by standard linear methods of analysis. We discuss one such nonlinear interaction in the context of a direct comparison with a procedure involving pairwise correlation, designed to be an analogous linear version of functional mapping. We find another such interaction that suggests a novel distinction in brain function between drinking and non-drinking adolescents: a tighter coupling, among drinkers, of ROIs associated with emotion, reward, and interoceptive processes such as thirst. Finally, we outline many improvements and extensions of the methodology to reduce computational expense, complement other analytical tools like graph-theoretic analysis, and allow for voxel-level NFM to eliminate the necessity of ROI selection.
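To give a flavor of what discovering nonlinear interactions among ROIs can mean, the toy sketch below searches over small models of one ROI's time series built from other ROIs, including product (interaction) terms that a purely linear pairwise-correlation analysis would not consider. The exhaustive search over tiny models is only a stand-in for NFM's evolutionary search, and all names are hypothetical.

```python
# Toy illustration of nonlinear functional mapping: explain one ROI's time
# series with a small model over other ROIs, allowing nonlinear product
# terms. Practical only for a handful of ROIs; NFM uses evolutionary search.
import itertools
import numpy as np

def fit_r2(X, y):
    """R^2 of a least-squares fit of y from feature matrix X (plus bias)."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - resid.var() / y.var()

def best_model(roi_ts, target_idx, max_terms=2):
    """Search small sets of linear and product (interaction) features."""
    T, n = roi_ts.shape
    y = roi_ts[:, target_idx]
    others = [i for i in range(n) if i != target_idx]
    # candidate features: single ROIs plus pairwise products (nonlinear terms)
    feats = {(i,): roi_ts[:, i] for i in others}
    feats.update({(i, j): roi_ts[:, i] * roi_ts[:, j]
                  for i, j in itertools.combinations(others, 2)})
    best = max(itertools.combinations(feats, max_terms),
               key=lambda ks: fit_r2(np.column_stack([feats[k] for k in ks]), y))
    return best, fit_r2(np.column_stack([feats[k] for k in best]), y)
```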
We present a novel method for aligning a sequence of instructions to a video of someone carrying out a task. In particular, we focus on the cooking domain, where the instructions correspond to the recipe. Our technique relies on an HMM to align the recipe steps to the (automatically generated) speech transcript. We then refine this alignment using a state-of-the-art visual food detector, based on a deep convolutional neural network. We show that our technique outperforms simpler techniques based on keyword spotting. It also enables interesting applications, such as automatically illustrating recipes with keyframes, and searching within a video for events of interest.
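The core alignment step can be sketched as a monotonic dynamic program: each transcript segment is assigned to a recipe step, transitions may only stay on the current step or advance to the next, and emissions score textual similarity. This is a simplified stand-in for the paper's HMM (and omits the visual refinement); the word-overlap emission and `stay_bonus` parameter are illustrative assumptions.

```python
# Sketch of monotonic HMM-style alignment of transcript segments to recipe
# steps via dynamic programming; a simplified stand-in for the paper's model.
import numpy as np

def overlap(a, b):
    """Jaccard word overlap between two text snippets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(1, len(wa | wb))

def align(transcript_segments, recipe_steps, stay_bonus=0.1):
    T, S = len(transcript_segments), len(recipe_steps)
    em = np.array([[overlap(seg, step) for step in recipe_steps]
                   for seg in transcript_segments])
    dp = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    dp[0, 0] = em[0, 0]                       # alignment starts at step 0
    for t in range(1, T):
        for s in range(S):
            stay = dp[t - 1, s] + stay_bonus  # remain on the same step
            move = dp[t - 1, s - 1] if s > 0 else -np.inf
            dp[t, s] = em[t, s] + max(stay, move)
            back[t, s] = s if stay >= move else s - 1
    path = [S - 1]                            # alignment ends at the last step
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return path[::-1]   # recipe step index for each transcript segment
```

With such an alignment in hand, illustrating a recipe reduces to picking a keyframe from each step's aligned video span.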
This is the Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, which was held on Catalina Island, CA, August 14-18, 2012.
Dynamic probabilistic networks (DPNs) are a compact representation of complex stochastic processes. In this paper we examine how to learn the structure of a DPN from data. We extend structure scoring rules for standard probabilistic networks to the dynamic case, and show how to search for structure when some of the variables are hidden. Finally, we examine two applications where such a technology might be useful: predicting and classifying dynamic behaviors, and learning causal orderings in biological processes. We provide empirical results that demonstrate the applicability of our methods in both domains.
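To illustrate what a structure score looks like in the dynamic case, here is a sketch of a BIC-style family score for one node in slice t given candidate parents in slice t-1, assuming binary variables and fully observed data; handling hidden variables, as in the paper, would wrap such a score in an EM-style procedure. All names are illustrative.

```python
# Sketch of BIC family scoring for a two-slice dynamic network with binary
# variables and complete data. A greedy search would keep, for each child,
# the parent set maximizing this score.
import numpy as np
from itertools import product

def bic_family(data_t, data_tm1, child, parents):
    """BIC contribution of one child node given parents in the previous slice.

    data_t, data_tm1: (N, n_vars) binary arrays for slices t and t-1.
    """
    N = len(data_t)
    ll, n_params = 0.0, 0
    for pa_vals in product([0, 1], repeat=len(parents)):
        mask = np.all(data_tm1[:, parents] == pa_vals, axis=1)
        n = mask.sum()
        if n == 0:
            continue
        ones = data_t[mask, child].sum()
        p = (ones + 1) / (n + 2)          # Laplace-smoothed CPT entry
        ll += ones * np.log(p) + (n - ones) * np.log(1 - p)
        n_params += 1
    return ll - 0.5 * n_params * np.log(N)   # fit minus complexity penalty
```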
Recently, researchers have demonstrated that "loopy belief propagation" -- the use of Pearl's polytree algorithm in a Bayesian network with loops -- can perform well in the context of error-correcting codes. The most dramatic instance of this is the near Shannon-limit performance of "Turbo Codes", codes whose decoding algorithm is equivalent to loopy belief propagation in a chain-structured Bayesian network. In this paper we ask: is there something special about the error-correcting code context, or does loopy propagation work as an approximate inference scheme in a more general setting? We compare the marginals computed using loopy propagation to the exact ones in four Bayesian network architectures, including two real-world networks: ALARM and QMR. We find that the loopy beliefs often converge, and when they do, they give a good approximation to the correct marginals. However, on the QMR network, the loopy beliefs oscillated and had no obvious relationship to the correct posteriors. We present some initial investigations into the cause of these oscillations, and show that some simple methods of preventing them lead to the wrong results.
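For concreteness, here is a generic sketch of loopy belief propagation on a pairwise model: messages are passed repeatedly around the (loopy) graph, and beliefs are read off as normalized products of incoming messages. This is a textbook formulation used for illustration, not the paper's exact Pearl-style implementation on ALARM and QMR; as the abstract notes, on some networks the iteration converges and on others it oscillates.

```python
# Generic loopy belief propagation (sum-product) on a pairwise model.
# unary: {node: potential vector}; pairwise: {(i, j): matrix psi[x_i, x_j]}.
import numpy as np

def loopy_bp(unary, pairwise, edges, n_iters=50):
    msgs = {(i, j): np.ones(len(unary[j])) for i, j in edges}
    msgs.update({(j, i): np.ones(len(unary[i])) for i, j in edges})
    for _ in range(n_iters):
        new = {}
        for (i, j) in list(msgs):
            # product of i's unary term and all incoming messages except j's
            prod = unary[i].copy()
            for (k, l) in msgs:
                if l == i and k != j:
                    prod *= msgs[(k, l)]
            psi = pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T
            m = psi.T @ prod                 # sum out x_i
            new[(i, j)] = m / m.sum()        # normalize for stability
        msgs = new
    beliefs = {}
    for i in unary:
        b = unary[i].copy()
        for (k, l) in msgs:
            if l == i:
                b *= msgs[(k, l)]
        beliefs[i] = b / b.sum()
    return beliefs
```

On a tree this recovers the exact marginals; on a loopy graph the fixed point, when it exists, is only an approximation, which is exactly the behavior the paper probes.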
We show how to use a variational approximation to the logistic function to perform approximate inference in Bayesian networks containing discrete nodes with continuous parents. Essentially, we convert the logistic function to a Gaussian, which facilitates exact inference, and then iteratively adjust the variational parameters to improve the quality of the approximation. We demonstrate experimentally that this approximation is faster and potentially more accurate than sampling. We also introduce a simple new technique for handling evidence, which allows us to handle arbitrary distributions on observed nodes, as well as achieving a significant speedup in networks with discrete variables of large cardinality.
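The conversion of the logistic function to a Gaussian-compatible form can be illustrated with the standard Jaakkola-Jordan variational lower bound, which is quadratic in x inside the exponential (hence Gaussian in form) and is tightened by adjusting the variational parameter xi. The snippet below demonstrates the bound itself, with the iterative re-estimation of xi only indicated in a comment.

```python
# Jaakkola-Jordan variational lower bound on the logistic sigmoid:
#   sigma(x) >= sigma(xi) * exp((x - xi)/2 - lam(xi) * (x^2 - xi^2)),
# where lam(xi) = tanh(xi/2) / (4 xi). Quadratic in x, hence Gaussian-form.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lam(xi):
    return np.tanh(xi / 2.0) / (4.0 * xi)

def sigmoid_lower_bound(x, xi):
    return sigmoid(xi) * np.exp((x - xi) / 2.0 - lam(xi) * (x**2 - xi**2))

x = np.linspace(-6, 6, 13)
xi = 2.5
assert np.all(sigmoid_lower_bound(x, xi) <= sigmoid(x) + 1e-12)
# The bound is exact at x = +/- xi; in practice xi is re-estimated from the
# current posterior moments (xi^2 = E[x^2]) and iterated to convergence.
```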
Particle filters (PFs) are powerful sampling-based inference/learning algorithms for dynamic Bayesian networks (DBNs). They allow us to treat, in a principled way, any type of probability distribution, nonlinearity and non-stationarity. They have appeared in several fields under such names as "condensation", "sequential Monte Carlo" and "survival of the fittest". In this paper, we show how we can exploit the structure of the DBN to increase the efficiency of particle filtering, using a technique known as Rao-Blackwellisation. Essentially, this samples some of the variables, and marginalizes out the rest exactly, using the Kalman filter, HMM filter, junction tree algorithm, or any other finite dimensional optimal filter. We show that Rao-Blackwellised particle filters (RBPFs) lead to more accurate estimates than standard PFs. We demonstrate RBPFs on two problems, namely non-stationary online regression with radial basis function networks and robot localization and map building. We also discuss other potential application areas and provide references to some finite dimensional optimal filters.
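The Rao-Blackwellised structure can be sketched generically: each particle carries a sampled value of the "hard" variables plus the sufficient statistics (mean and covariance) of an exact conditional filter over the remaining, conditionally linear-Gaussian variables, and particle weights use the exact marginal likelihood from that filter. The model-specific pieces (`sample_z`, `kalman_update`) are placeholders for whatever conditional structure the DBN exposes.

```python
# Schematic Rao-Blackwellised particle filter step. Each particle is
# (z, mean, cov): a sampled value of z plus an exact Kalman posterior over
# the conditionally linear-Gaussian state. Model functions are placeholders.
import numpy as np

def rbpf_step(particles, weights, y, sample_z, kalman_update):
    """One filtering step given observation y."""
    new_particles, new_weights = [], []
    for (z, mean, cov), w in zip(particles, weights):
        z_new = sample_z(z)                   # sample only the hard variable
        # exact conditional update; ll is the marginal likelihood p(y | z, past)
        mean_new, cov_new, ll = kalman_update(z_new, mean, cov, y)
        new_particles.append((z_new, mean_new, cov_new))
        new_weights.append(w * ll)
    new_weights = np.array(new_weights)
    new_weights /= new_weights.sum()
    # resample when the effective sample size gets small
    if 1.0 / np.sum(new_weights**2) < len(new_particles) / 2:
        idx = np.random.choice(len(new_particles), len(new_particles),
                               p=new_weights)
        new_particles = [new_particles[i] for i in idx]
        new_weights = np.full(len(new_particles), 1.0 / len(new_particles))
    return new_particles, new_weights
```

Because only the hard variables are sampled, the particles live in a lower-dimensional space, which is the source of the variance reduction over a plain PF.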
The Factored Frontier (FF) algorithm is a simple approximate inference algorithm for Dynamic Bayesian Networks (DBNs). It is very similar to the fully factorized version of the Boyen-Koller (BK) algorithm, but instead of doing an exact update at every step followed by marginalisation (projection), it always works with factored distributions. Hence it can be applied to models for which the exact update step is intractable. We show that FF is equivalent to (one iteration of) loopy belief propagation (LBP) on the original DBN, and that BK is equivalent to (one iteration of) LBP on a DBN where we cluster some of the nodes. We then show empirically that by iterating, LBP can improve on the accuracy of both FF and BK. We compare these algorithms on two real-world DBNs: the first is a model of a water treatment plant, and the second is a coupled HMM, used to model freeway traffic.
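The FF update can be sketched directly from this description: the belief state is always kept as a product of per-node marginals, and each node's new marginal is obtained by pushing its parents' current marginals through its conditional distribution, with no exact joint update or projection step. The sketch below assumes binary nodes; the data structures are illustrative.

```python
# Sketch of a Factored Frontier style update for a DBN with binary nodes.
# The belief state is a product of per-node marginals; each node's new
# marginal is computed from its parents' current marginals alone.
import numpy as np
from itertools import product

def ff_step(marginals, parents, cpts):
    """marginals: {node: p(node=1)}; parents: {node: [parent nodes]};
    cpts: {node: {parent_assignment_tuple: p(node=1 | parents)}}."""
    new = {}
    for node, pa in parents.items():
        p1 = 0.0
        for assignment in product([0, 1], repeat=len(pa)):
            # probability of this parent assignment under the factored belief
            w = np.prod([marginals[p] if v == 1 else 1 - marginals[p]
                         for p, v in zip(pa, assignment)])
            p1 += w * cpts[node][assignment]
        new[node] = p1
    return new
```

Because the factored form is never expanded into a joint distribution, the cost per step is exponential only in each node's fan-in, not in the slice size; the price, as the abstract explains, is the approximation that iterated LBP can then improve on.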