Lifted inference exploits symmetries in probabilistic graphical models by using a representative for indistinguishable objects, thereby speeding up query answering while maintaining exact answers. Even though lifting is a well-established technique for the task of probabilistic inference in relational domains, it has not yet been applied to the task of causal inference. In this paper, we show how lifting can be applied to efficiently compute causal effects in relational domains. More specifically, we introduce parametric causal factor graphs as an extension of parametric factor graphs incorporating causal knowledge and give a formal semantics of interventions therein. We further present the lifted causal inference algorithm to compute causal effects on a lifted level, thereby drastically speeding up causal inference compared to propositional inference, e.g., in causal Bayesian networks. In our empirical evaluation, we demonstrate the effectiveness of our approach.
In this report we explore the application of the Lagrange-Newton method to the SAM (smoothing-and-mapping) problem in mobile robotics. In Lagrange-Newton SAM, the angular component of each pose vector is expressed by orientation vectors and treated through Lagrange constraints. This is different from the typical Gauss-Newton approach where variations need to be mapped back and forth between Euclidean space and a manifold suitable for rotational components. We derive equations for five different types of measurements between robot poses: translation, distance, and rotation from odometry in the plane, as well as home-vector angle and compass angle from visual homing. We demonstrate the feasibility of the Lagrange-Newton approach for a simple example related to a cleaning robot scenario.
Lifted probabilistic inference exploits symmetries in a probabilistic model to allow for tractable probabilistic inference with respect to domain sizes. To apply lifted inference, a lifted representation has to be obtained, and to do so, the so-called colour passing algorithm is the state of the art. The colour passing algorithm, however, is bound to a specific inference algorithm, and we found that it ignores commutativity of factors while constructing a lifted representation. We contribute a modified version of the colour passing algorithm that uses logical variables to construct a lifted representation independent of a specific inference algorithm while at the same time exploiting commutativity of factors during an offline step. Our proposed algorithm efficiently detects more symmetries than the state of the art and thereby drastically increases compression, yielding significantly faster online query times for probabilistic inference when the resulting model is applied.
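As a rough illustration of the baseline idea, the following is a minimal sketch of plain colour passing on a small factor graph, not the modified algorithm contributed here. It assumes no evidence, factors given as (potential table, argument names) pairs with distinct arguments, and a hypothetical function name `colour_passing`. Variables and factors that end up with the same colour would be grouped in the lifted representation. The sketch is sensitive to argument positions, so it does not exploit commutative factors.

```python
def colour_passing(var_names, factors):
    """Group variables and factors by colour (illustrative sketch only).

    factors: list of (potential_table, argument_names) pairs, both tuples.
    Assumes no evidence and that no variable appears twice in one factor.
    """
    # Without evidence, all variables start with the same colour.
    var_colour = {v: 0 for v in var_names}
    # Factors start with one colour per distinct potential table.
    table_ids = {}
    fac_colour = [table_ids.setdefault(table, len(table_ids))
                  for table, _ in factors]

    while True:
        # Factors collect the colours of their arguments, in argument order.
        fac_ids = {}
        new_fac_colour = []
        for i, (_, args) in enumerate(factors):
            signature = (fac_colour[i], tuple(var_colour[v] for v in args))
            new_fac_colour.append(fac_ids.setdefault(signature, len(fac_ids)))

        # Variables collect (factor colour, argument position) pairs
        # as a sorted multiset, together with their own previous colour.
        var_ids = {}
        new_var_colour = {}
        for v in var_names:
            messages = sorted((new_fac_colour[i], args.index(v))
                              for i, (_, args) in enumerate(factors)
                              if v in args)
            signature = (var_colour[v], tuple(messages))
            new_var_colour[v] = var_ids.setdefault(signature, len(var_ids))

        # Signatures contain the previous colour, so groups can only split.
        # An unchanged number of groups therefore means a stable grouping.
        stable = (len(set(new_var_colour.values())) == len(set(var_colour.values()))
                  and len(set(new_fac_colour)) == len(set(fac_colour)))
        var_colour, fac_colour = new_var_colour, new_fac_colour
        if stable:
            return var_colour, fac_colour


# Two factors f1(A, B) and f2(C, B) with identical potential tables.
table = (1.0, 2.0, 3.0, 4.0)
var_groups, fac_groups = colour_passing(
    ["A", "B", "C"], [(table, ("A", "B")), (table, ("C", "B"))])
```

In this toy graph, `A` and `C` play the same role and receive the same colour, while `B` is separated because it occupies the second argument position of both factors.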
We describe a Lagrange-Newton framework for the derivation of learning rules with desirable convergence properties and apply it to the case of principal component analysis (PCA). In this framework, a Newton descent is applied to an extended variable vector which also includes Lagrange multipliers introduced with constraints. The Newton descent guarantees equal convergence speed from all directions, but is also required to produce stable fixed points in the system with the extended state vector. The framework produces "coupled" PCA learning rules which simultaneously estimate an eigenvector and the corresponding eigenvalue in cross-coupled differential equations. We demonstrate the feasibility of this approach for two PCA learning rules, one for the estimation of the principal eigenvector-eigenvalue pair (eigenpair), the other for the estimation of an arbitrary eigenpair.
DNA-based nanonetworks have a wide range of promising use cases, especially in the field of medicine. With a large set of agents, a partially observable stochastic environment, and noisy observations, such nanoscale systems can be modelled as a decentralised partially observable Markov decision process (DecPOMDP). As the agent set is a dominating factor, this paper presents (i) lifted DecPOMDPs, which partition the agent set into sets of indistinguishable agents and thereby reduce the worst-case space required, and (ii) a nanoscale medical system as an application. Future work turns to solving and implementing lifted DecPOMDPs.
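To give a feel for why the agent set dominates, the sketch below compares the number of joint action assignments for distinguishable agents with the number of action histograms when agents within a partition are indistinguishable. The counting representation and both function names are assumptions made for illustration; the abstract itself does not specify how lifted DecPOMDPs store their components.

```python
from math import comb


def joint_action_count(num_agents, num_actions):
    """Joint actions when every agent is distinguishable: |A|^n."""
    return num_actions ** num_agents


def lifted_action_count(partition_sizes, num_actions):
    """Joint actions when only per-partition action counts matter.

    For a partition of size k, the number of histograms over |A| actions
    is C(k + |A| - 1, |A| - 1) (stars and bars).
    """
    total = 1
    for size in partition_sizes:
        total *= comb(size + num_actions - 1, num_actions - 1)
    return total


propositional = joint_action_count(10, 3)     # 3 ** 10
one_partition = lifted_action_count([10], 3)  # C(12, 2)
two_partitions = lifted_action_count([5, 5], 3)  # C(7, 2) ** 2
```

Under these assumptions, the propositional count grows exponentially in the number of agents, while the histogram count grows only polynomially in each partition size.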
Fully symmetric learning rules for principal component analysis can be derived from a novel objective function suggested in our previous work. We observed that these learning rules suffer from slow convergence for covariance matrices where some principal eigenvalues are close to each other. Here we describe a modified objective function with an additional term which mitigates this convergence problem. We show that the learning rule derived from the modified objective function inherits all fixed points from the original learning rule (but may introduce additional ones). The stability of the inherited fixed points also remains unchanged; only the steepness of the objective function is increased in some directions. Simulations confirm that the convergence speed can be noticeably improved, depending on the weight factor of the additional term.
Neural learning rules for principal component / subspace analysis (PCA / PSA) can be derived by maximizing an objective function (summed variance of the projection on the subspace axes) under an orthonormality constraint. For a subspace with a single axis, the optimization produces the principal eigenvector of the data covariance matrix. Hierarchical learning rules with deflation procedures can then be used to extract multiple eigenvectors. However, for a subspace with multiple axes, the optimization leads to PSA learning rules which only converge to axes spanning the principal subspace but not to the principal eigenvectors. A modified objective function with distinct weight factors had to be introduced to produce PCA learning rules. Optimization of the objective function for multiple axes leads to symmetric learning rules which do not require deflation procedures. For the PCA case, the estimated principal eigenvectors are ordered (w.r.t. the corresponding eigenvalues) depending on the order of the weight factors. Here we introduce an alternative objective function where it is not necessary to introduce fixed weight factors; instead, the alternative objective function uses squared summands. Optimization leads to symmetric PCA learning rules which converge to the principal eigenvectors, but without imposing an order. In place of the diagonal matrices with fixed weight factors, variable diagonal matrices appear in the learning rules. We analyze this alternative approach by determining the fixed points of the constrained optimization. The behavior of the constrained objective function at the fixed points is analyzed, which confirms both the PCA behavior and the fact that no order is imposed. Different ways to derive learning rules from the optimization of the objective function are presented. The role of the terms in the learning rules obtained from these derivations is explored.
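The single-axis case mentioned above can be illustrated with the classic Oja rule in its averaged (deterministic) form, which ascends the projected variance while keeping the weight vector close to unit length. This is a textbook rule used here as a stand-in, not one of the learning rules derived in this work; the covariance matrix, step size, and helper names are assumptions chosen for the example.

```python
def mat_vec(C, w):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(c * x for c, x in zip(row, w)) for row in C]


def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))


def oja_principal_axis(C, w0, eta=0.05, steps=2000):
    """Averaged Oja rule: w <- w + eta * (C w - (w^T C w) w)."""
    w = list(w0)
    for _ in range(steps):
        Cw = mat_vec(C, w)
        projected_variance = dot(w, Cw)
        w = [wi + eta * (cwi - projected_variance * wi)
             for wi, cwi in zip(w, Cw)]
    return w


# Example covariance matrix (assumed) with eigenvalues (5 +/- sqrt(5)) / 2.
C = [[3.0, 1.0], [1.0, 2.0]]
w = oja_principal_axis(C, [1.0, 0.0])
variance = dot(w, mat_vec(C, w))  # approaches the largest eigenvalue
```

With this small step size the iteration settles on a unit-length vector along the principal eigenvector, and the projected variance approaches the largest eigenvalue of `C`.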
In coupled learning rules for PCA (principal component analysis) and SVD (singular value decomposition), the update of the estimates of eigenvectors or singular vectors is influenced by the estimates of eigenvalues or singular values, respectively. This coupled update mitigates the speed-stability problem since the update equations converge from all directions with approximately the same speed. A method to derive coupled learning rules from information criteria by Newton optimization is known. However, these information criteria have to be designed, offer no explanatory value, and can only impose Euclidean constraints on the vector estimates. Here we describe an alternative approach where coupled PCA and SVD learning rules can systematically be derived from a Newton zero-finding framework. The derivation starts from an objective function, combines the equations for its extrema with arbitrary constraints on the vector estimates, and solves the resulting vector zero-point equation using Newton's zero-finding method. To demonstrate the framework, we derive PCA and SVD learning rules with constant Euclidean length or constant sum of the vector estimates.
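The zero-finding idea can be sketched for the PCA case with constant Euclidean length. The example below assumes the stationarity condition C w = λ w together with the constraint (1 - wᵀw)/2 = 0, stacks both into one vector equation F(w, λ) = 0, and applies plain batch Newton steps. It is an illustration under these assumptions, not the coupled learning rules derived in this work; the function names, the example matrix, and the starting point are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def newton_eigenpair(C, w, lam, steps=20):
    """Newton zero-finding on F(w, lam) = [C w - lam w; (1 - w^T w) / 2]."""
    n = len(w)
    for _ in range(steps):
        Cw = [sum(C[i][j] * w[j] for j in range(n)) for i in range(n)]
        F = [Cw[i] - lam * w[i] for i in range(n)]
        F.append(0.5 * (1.0 - sum(x * x for x in w)))
        # Jacobian of F with respect to the extended vector (w, lam).
        J = [[C[i][j] - (lam if i == j else 0.0) for j in range(n)] + [-w[i]]
             for i in range(n)]
        J.append([-x for x in w] + [0.0])
        delta = solve_linear(J, [-f for f in F])
        w = [w[i] + delta[i] for i in range(n)]
        lam += delta[n]
    return w, lam


# Example matrix (assumed) with eigenpairs (3, (1, 1)/sqrt(2)) and (1, (1, -1)/sqrt(2)).
C = [[2.0, 1.0], [1.0, 2.0]]
w, lam = newton_eigenpair(C, [1.0, 0.5], 2.5)
```

Because the eigenvalue estimate is part of the extended vector, the vector and eigenvalue updates are coupled through the Jacobian. Newton's method converges to whichever zero is near the starting point, so a different initial guess may yield a different eigenpair.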
Large probabilistic models are often shaped by a pool of known individuals (a universe) and relations between them. Lifted inference algorithms handle sets of known individuals for tractable inference. Universes may not always be known, though, or may only be described by assumptions such as "small universes are more likely". Without a universe, inference is no longer possible for lifted algorithms, which thereby lose their advantage of tractable inference. The aim of this paper is to define a semantics for models with unknown universes, decoupled from a specific constraint language, to enable lifted, and thereby tractable, inference.