We introduce a Deep Boltzmann Machine (DBM) model suitable for modeling and extracting latent semantic representations from a large, unstructured collection of documents. We overcome the apparent difficulty of training a DBM by judiciously tying its parameters. This parameter tying enables an efficient pretraining algorithm and a state initialization scheme that aids inference. As a result, the model can be trained just as efficiently as a standard Restricted Boltzmann Machine. Our experiments show that the model assigns better log probability to unseen data than the Replicated Softmax model, and that features extracted from it outperform LDA, Replicated Softmax, and DocNADE features on document retrieval and document classification tasks.
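To make the parameter tying concrete, the sketch below shows one Gibbs-style update in a two-hidden-layer document DBM of this flavor. The sizes, variable names, and the (N + M) bias scaling are illustrative assumptions rather than code or notation from the paper; the point is only that the visible word-softmax units and the tied top-layer softmax units drive the binary hidden layer through the same weight matrix W.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper): vocabulary K, document
# length N, M tied top-layer softmax units, J binary hidden units.
K, N, M, J = 2000, 100, 50, 128

W = 0.01 * rng.standard_normal((K, J))  # one weight matrix shared by both layers (the tying)
b_vis = np.zeros(K)                     # biases of the word softmax
b_hid = np.zeros(J)                     # biases of the binary hidden layer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_hidden(v_counts, h2_counts):
    """p(h1_j = 1 | v, h2): word counts from below and above enter through the same W."""
    total = v_counts + h2_counts
    return sigmoid(total @ W + (N + M) * b_hid)

def p_words(h1):
    """Word distribution shared by the visible and the tied top-layer softmax units."""
    logits = W @ h1 + b_vis
    logits -= logits.max()              # for numerical stability
    p = np.exp(logits)
    return p / p.sum()

# One Gibbs-style sweep on a random "document".
v_counts = rng.multinomial(N, np.ones(K) / K)    # observed word counts
h2_counts = rng.multinomial(M, np.ones(K) / K)   # top-layer pseudo-word counts
h1 = (rng.random(J) < p_hidden(v_counts, h2_counts)).astype(float)
p = p_words(h1)
v_counts = rng.multinomial(N, p)                 # resample the N visible words
h2_counts = rng.multinomial(M, p)                # resample the M tied top-layer words
```

Because the same W appears both below and above the binary hidden layer in this sketch, a contrastive-divergence-style update only ever touches one weight matrix, which is the sense in which training can be as cheap as for a standard RBM.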
Many practitioners who use the EM algorithm complain that it is sometimes slow. When does this happen, and what can be done about it? In this paper, we study the general class of bound optimization algorithms - including Expectation-Maximization, Iterative Scaling, and CCCP - and their relationship to direct optimization algorithms such as gradient-based methods for parameter learning. We derive a general relationship between the updates performed by bound optimization methods and those of gradient and second-order methods, and we identify analytic conditions under which bound optimization algorithms exhibit quasi-Newton behavior and conditions under which they exhibit only slow, first-order convergence. Based on this analysis, we consider several specific algorithms, interpret and analyze their convergence properties, and provide recipes for preprocessing the input to these algorithms to yield faster convergence. We report empirical results supporting our analysis and showing that simple data preprocessing can dramatically improve the performance of bound optimizers in practice.
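As a toy illustration of the regime this abstract refers to (not an experiment from the paper), the sketch below runs EM on a one-dimensional mixture of two unit-variance Gaussians, estimating only the two means. When the components are well separated, the responsibilities are nearly 0 or 1 and EM converges in a handful of iterations; when they overlap heavily, convergence degrades to the slow first-order behavior described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def em_two_gaussians(x, mu_init, n_iter=500, tol=1e-8):
    """EM for a 1-D mixture of two unit-variance, equal-weight Gaussians.

    Only the two means are estimated; this toy setup makes the effect of
    component overlap on convergence speed easy to see.
    """
    mu = np.array(mu_init, dtype=float)
    for t in range(n_iter):
        # E-step: responsibility of component 0 for each point.
        log_p0 = -0.5 * (x - mu[0]) ** 2
        log_p1 = -0.5 * (x - mu[1]) ** 2
        r0 = 1.0 / (1.0 + np.exp(log_p1 - log_p0))
        # M-step: responsibility-weighted means.
        new_mu = np.array([(r0 * x).sum() / r0.sum(),
                           ((1 - r0) * x).sum() / (1 - r0).sum()])
        if np.max(np.abs(new_mu - mu)) < tol:
            return new_mu, t + 1
        mu = new_mu
    return mu, n_iter

for sep in (6.0, 1.0):   # well-separated vs heavily overlapping components
    x = np.concatenate([rng.normal(-sep / 2, 1.0, 500),
                        rng.normal(+sep / 2, 1.0, 500)])
    mu, iters = em_two_gaussians(x, mu_init=[-1.0, 1.0])
    print(f"separation {sep}: {iters} EM iterations, final means {mu.round(2)}")
```

In the overlapping case the "missing information" (which component generated each point) is large, which is exactly the situation in which bound optimizers lose their quasi-Newton-like speed.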
The recent proliferation of richly structured probabilistic models raises the question of how to automatically determine an appropriate model for a dataset. We investigate this question for a space of matrix decomposition models which can express a variety of widely used models from unsupervised learning. To enable model selection, we organize these models into a context-free grammar which generates a wide variety of structures through the compositional application of a few simple rules. We use our grammar to generically and efficiently infer latent components and estimate predictive likelihood for nearly 2500 structures using a small toolbox of reusable algorithms. Using a greedy search over our grammar, we automatically choose the decomposition structure from raw data by evaluating only a small fraction of all models. The proposed method typically finds the correct structure for synthetic data and backs off gracefully to simpler models under heavy noise. It learns sensible structures for datasets as diverse as image patches, motion capture, 20 Questions, and U.S. Senate votes, all using exactly the same code.
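The following is a minimal Python sketch, not the paper's code, of the compositional search skeleton: structures are strings over a few component symbols, productions rewrite a Gaussian leaf, and a greedy search keeps the best-scoring expansion at each level. The `score` callable is a hypothetical stand-in for the paper's held-out predictive-likelihood estimate, which in the actual system comes from fitting each candidate decomposition with the reusable inference toolbox.

```python
# Component symbols, loosely following the paper's notation:
# G = Gaussian, M = multinomial (cluster assignments), B = binary, C = chain.
# Productions rewrite a Gaussian leaf, e.g. low-rank  G -> GG + G,
# clustering  G -> MG + G, binary features  G -> BG + G.
PRODUCTIONS = ["(GG + G)", "(MG + G)", "(BG + G)", "(CG + G)"]

def expansions(structure):
    """All structures reachable by rewriting a single 'G' leaf with one production."""
    out = []
    for i, ch in enumerate(structure):
        if ch == "G":
            for prod in PRODUCTIONS:
                out.append(structure[:i] + prod + structure[i + 1:])
    return out

def greedy_search(data, score, depth=3):
    """Greedy structure search. `score(structure, data)` is a hypothetical callable
    returning an estimate of held-out predictive likelihood for that structure."""
    best, best_score = "G", score("G", data)
    for _ in range(depth):
        scored = [(score(s, data), s) for s in expansions(best)]
        top_score, top = max(scored)
        if top_score <= best_score:      # stop when no expansion improves the estimate
            break
        best, best_score = top, top_score
    return best
```

The real grammar contains more production rules and also expands non-Gaussian components; the sketch is only meant to convey how compositional expansion plus greedy evaluation keeps the number of models that must actually be fit to a small fraction of the full space.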