Open-domain Question Answering (QA) systems must interact with external knowledge sources, such as web pages, to find relevant information. Information sources like Wikipedia, however, are not well structured and are difficult to utilize in comparison with Knowledge Bases (KBs). In this work we present a two-step approach to question answering from unstructured text, consisting of a retrieval step and a comprehension step. For comprehension, we present an RNN-based attention model with a novel mixture mechanism for selecting answers from either retrieved articles or a fixed vocabulary. For retrieval, we introduce a hand-crafted model and a neural model for ranking relevant articles. We achieve state-of-the-art performance on the WikiMovies dataset, reducing the error by 40%. Our experimental results further demonstrate the importance of each of the introduced components.
We propose a simple domain adaptation method for neural networks in a supervised setting. Supervised domain adaptation improves generalization performance on the target domain by using the source domain dataset, assuming that both datasets are labeled. Recently, recurrent neural networks have been shown to be successful on a variety of NLP tasks such as caption generation; however, existing domain adaptation techniques are limited to (1) fine-tuning the model parameters on the target dataset after training on the source dataset, or (2) designing the network to have dual outputs, one for the source domain and the other for the target domain. Reformulating the idea of the domain adaptation technique proposed by Daume (2007), we propose a simple domain adaptation method that can be applied to neural networks trained with a cross-entropy loss. On captioning datasets, we show performance improvements over other domain adaptation methods.
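For orientation, the feature-augmentation idea of Daume (2007) that the abstract above builds on can be sketched as follows. This is a minimal illustration of the original feature-space trick only, not the neural reformulation proposed in the paper; the function name `augment` and the domain labels are hypothetical.

```python
import numpy as np


def augment(x, domain):
    """Map a feature vector to the augmented space (shared, source-only, target-only).

    Source examples become (x, x, 0); target examples become (x, 0, x).
    """
    x = np.asarray(x, dtype=float)
    zeros = np.zeros_like(x)
    if domain == "source":
        return np.concatenate([x, x, zeros])
    if domain == "target":
        return np.concatenate([x, zeros, x])
    raise ValueError(f"unknown domain: {domain!r}")


# Illustrative feature vector (made up for this sketch).
x = np.array([1.0, 2.0])
src = augment(x, "source")
tgt = augment(x, "target")

# Same-domain inner products count the features twice (shared + domain copy),
# while cross-domain inner products only see the shared copy.
same_domain = float(src @ src)
cross_domain = float(src @ tgt)
```

A linear model trained on the augmented features can therefore learn weights that are shared across domains alongside weights specific to each domain.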
We propose a new approach to the theoretical analysis of Loopy Belief Propagation (LBP) and the Bethe free energy (BFE) by establishing a formula that connects LBP and BFE with a graph zeta function. The proposed approach is applicable to a wide class of models, including multinomial and Gaussian types. The connection yields a number of new theoretical results on LBP and BFE. This paper focuses on two such topics. One is the analysis of the region where the Hessian of the Bethe free energy is positive definite, which implies the non-convexity of BFE for graphs with multiple cycles, and a condition for convexity on a restricted set. This analysis also gives a new condition for the uniqueness of the LBP fixed point. The other result clarifies the relation between the local stability of a fixed point of LBP and local minima of the BFE, which implies, for example, that a locally stable fixed point of the Gaussian LBP is a local minimum of the Gaussian Bethe free energy.
We consider computation of the permanent of an $(N\times N)$ non-negative matrix, $P=(P_i^j|i,j=1,\cdots,N)$, or equivalently the problem of weighted counting of the perfect matchings over the complete bipartite graph $K_{N,N}$. The problem is known to be of likely exponential complexity. Stated as the partition function $Z$ of a graphical model, the problem allows an exact Loop Calculus representation [Chertkov, Chernyak '06] in terms of an interior minimum of the Bethe Free Energy functional over non-integer doubly stochastic matrices of marginal beliefs, $\beta=(\beta_i^j|i,j=1,\cdots,N)$, which also corresponds to a fixed point of the iterative message-passing algorithm of the Belief Propagation (BP) type. Our main result is an explicit expression of the exact partition function (permanent) in terms of the matrix of BP marginals, $\beta$, as $Z=\mbox{Perm}(P)=Z_{BP} \mbox{Perm}(\beta_i^j(1-\beta_i^j))/\prod_{i,j}(1-\beta_i^j)$, where $Z_{BP}$ is the BP expression for the permanent stated explicitly in terms of $\beta$. We give two derivations of the formula, a direct one based on the Bethe Free Energy and an alternative one combining the Ihara graph-$\zeta$ function and the Loop Calculus approaches. Assuming that the matrix $\beta$ of the Belief Propagation marginals is calculated, we provide two lower bounds and one upper bound to estimate the multiplicative term. The two complementary lower bounds are based on the Gurvits-van der Waerden theorem and on a relation between the modified permanent and determinant, respectively.
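To make the quantity in the abstract above concrete, the sketch below computes the permanent of a small matrix in two standard ways: directly from the definition as a weighted sum over perfect matchings (permutations), and with Ryser's inclusion-exclusion formula. It does not compute BP marginals or evaluate the paper's formula; the function names are hypothetical, and both routines take exponential time, consistent with the hardness noted above.

```python
import math
from itertools import permutations


def permanent_bruteforce(P):
    """Sum over all permutations s of prod_i P[i][s[i]] (one term per perfect matching)."""
    n = len(P)
    return sum(
        math.prod(P[i][s[i]] for i in range(n))
        for s in permutations(range(n))
    )


def permanent_ryser(P):
    """Ryser's formula: (-1)^n * sum over nonempty column subsets S of
    (-1)^|S| * prod_i sum_{j in S} P[i][j]."""
    n = len(P)
    total = 0.0
    for mask in range(1, 1 << n):
        cols = [j for j in range(n) if (mask >> j) & 1]
        row_prod = math.prod(sum(P[i][j] for j in cols) for i in range(n))
        total += (-1) ** len(cols) * row_prod
    return (-1) ** n * total


# Illustrative 2x2 matrix: the two perfect matchings contribute 1*4 and 2*3.
P = [[1.0, 2.0], [3.0, 4.0]]
perm_direct = permanent_bruteforce(P)
perm_ryser = permanent_ryser(P)
```

The brute-force version makes the matching interpretation explicit, while Ryser's formula reduces the cost from $N!$ terms to $2^N$ subsets.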
We propose a new approach to the analysis of Loopy Belief Propagation (LBP) by establishing a formula that connects the Hessian of the Bethe free energy with the edge zeta function. The formula has a number of theoretical implications for LBP. It is applied to give a sufficient condition for the Hessian of the Bethe free energy to be positive definite, which shows non-convexity for graphs with multiple cycles. The formula clarifies the relation between the local stability of a fixed point of LBP and local minima of the Bethe free energy. We also propose a new approach to the uniqueness of the LBP fixed point, and show various conditions for uniqueness.
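As background on the zeta functions mentioned above, the sketch below numerically checks the classical Ihara-Bass identity for the single-variable Ihara zeta function, $\zeta_G(u)^{-1}=\det(I-uB)=(1-u^2)^{|E|-|V|}\det(I-uA+u^2(D-I))$, where $B$ is the non-backtracking matrix on directed edges, $A$ the adjacency matrix and $D$ the degree matrix. This is the specialization in which every edge carries the same weight $u$; it is not the paper's Hessian formula. The function names are hypothetical, and the code assumes a simple graph (no self-loops or parallel edges) with vertices labeled $0,\dots,|V|-1$.

```python
import numpy as np


def nonbacktracking_matrix(edges):
    """Directed-edge matrix with B[(a->b), (c->d)] = 1 iff b == c and d != a."""
    directed = list(edges) + [(b, a) for a, b in edges]
    m = len(directed)
    B = np.zeros((m, m))
    for i, (a, b) in enumerate(directed):
        for j, (c, d) in enumerate(directed):
            if b == c and d != a:
                B[i, j] = 1.0
    return B


def inverse_zeta_edges(edges, u):
    """det(I - u B), computed on the 2|E| directed edges."""
    B = nonbacktracking_matrix(edges)
    return np.linalg.det(np.eye(B.shape[0]) - u * B)


def inverse_zeta_vertices(edges, n_vertices, u):
    """(1 - u^2)^(|E| - |V|) * det(I - u A + u^2 (D - I)), computed on the |V| vertices."""
    A = np.zeros((n_vertices, n_vertices))
    for a, b in edges:
        A[a, b] = A[b, a] = 1.0
    D = np.diag(A.sum(axis=1))
    I = np.eye(n_vertices)
    prefactor = (1.0 - u ** 2) ** (len(edges) - n_vertices)
    return prefactor * np.linalg.det(I - u * A + u ** 2 * (D - I))


# Complete graph K4 as a small example with multiple cycles.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
lhs = inverse_zeta_edges(k4_edges, 0.3)
rhs = inverse_zeta_vertices(k4_edges, 4, 0.3)
```

For a single cycle of length $n$ both sides reduce to $(1-u^n)^2$, which gives a quick sanity check on any implementation.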
The Bethe approximation, or the loopy belief propagation algorithm, is a successful method for approximating partition functions of probabilistic models associated with a graph. Chertkov and Chernyak derived an interesting formula called the Loop Series Expansion, which is an expansion of the partition function. The main term of the series is the Bethe approximation, while the other terms are labeled by subgraphs called generalized loops. In our recent paper, we derived the loop series expansion in the form of a polynomial with positive integer coefficients, and extended the result to the expansion of marginals. In this paper, we give a clearer derivation of these results and discuss the properties of the polynomial introduced in that paper.
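The subgraphs that index the loop series can be enumerated directly on small graphs. The sketch below assumes the usual definition of a generalized loop as a nonempty edge subset in which no vertex has degree exactly one; the abstract itself does not spell out the definition, so treat this as an assumption. The function name is hypothetical and the enumeration is exponential in the number of edges, so it is only meant for toy examples.

```python
from collections import Counter
from itertools import combinations


def generalized_loops(edges):
    """Return all nonempty edge subsets in which no vertex has degree exactly 1."""
    loops = []
    for k in range(1, len(edges) + 1):
        for subset in combinations(edges, k):
            degree = Counter()
            for a, b in subset:
                degree[a] += 1
                degree[b] += 1
            # Vertices absent from the subset have degree 0 and are allowed.
            if all(d != 1 for d in degree.values()):
                loops.append(subset)
    return loops


# A tree has no generalized loops; a single cycle has exactly one.
path_loops = generalized_loops([(0, 1), (1, 2), (2, 3)])
triangle_loops = generalized_loops([(0, 1), (1, 2), (2, 0)])
k4_loops = generalized_loops([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
```

On a tree the list is empty, which matches the fact that the Bethe approximation is exact there and the series has no correction terms.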