Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

Internal Model from Observations for Reward Shaping

Oct 14, 2018
Daiki Kimura, Subhajit Chaudhury, Ryuki Tachibana, Sakyasingha Dasgupta

Reinforcement learning methods require careful design involving a reward function to obtain the desired action policy for a given task. In the absence of hand-crafted reward functions, prior work on the topic has proposed several methods for reward estimation by using expert state trajectories and action pairs. However, there are cases where complete or good action information cannot be obtained from expert demonstrations. We propose a novel reinforcement learning method in which the agent learns an internal model of observation on the basis of expert-demonstrated state trajectories to estimate rewards without completely learning the dynamics of the external environment from state-action pairs. The internal model is obtained in the form of a predictive model for the given expert state distribution. During reinforcement learning, the agent predicts the reward as a function of the difference between the actual state and the state predicted by the internal model. We conducted multiple experiments in environments of varying complexity, including the Super Mario Bros and Flappy Bird games. We show our method successfully trains good policies directly from expert game-play videos.

* 7 pages, 6 figures, ICML workshop (ALA 2018) 

  Access Paper or Ask Questions

A Survey on Deep Learning Methods for Robot Vision

Mar 28, 2018
Javier Ruiz-del-Solar, Patricio Loncomilla, Naiomi Soto

Deep learning has allowed a paradigm shift in pattern recognition, from using hand-crafted features together with statistical classifiers to using general-purpose learning procedures for learning data-driven representations, features, and classifiers together. The application of this new paradigm has been particularly successful in computer vision, in which the development of deep learning methods for vision applications has become a hot research topic. Given that deep learning has already attracted the attention of the robot vision community, the main purpose of this survey is to address the use of deep learning in robot vision. To achieve this, a comprehensive overview of deep learning and its usage in computer vision is given, that includes a description of the most frequently used neural models and their main application areas. Then, the standard methodology and tools used for designing deep-learning based vision systems are presented. Afterwards, a review of the principal work using deep learning in robot vision is presented, as well as current and future trends related to the use of deep learning in robotics. This survey is intended to be a guide for the developers of robot vision systems.

  Access Paper or Ask Questions

Linear-Time Sequence Classification using Restricted Boltzmann Machines

Mar 08, 2018
Son N. Tran, Srikanth Cherla, Artur Garcez, Tillman Weyde

Classification of sequence data is the topic of interest for dynamic Bayesian models and Recurrent Neural Networks (RNNs). While the former can explicitly model the temporal dependencies between class variables, the latter have a capability of learning representations. Several attempts have been made to improve performance by combining these two approaches or increasing the processing capability of the hidden units in RNNs. This often results in complex models with a large number of learning parameters. In this paper, a compact model is proposed which offers both representation learning and temporal inference of class variables by rolling Restricted Boltzmann Machines (RBMs) and class variables over time. We address the key issue of intractability in this variant of RBMs by optimising a conditional distribution, instead of a joint distribution. Experiments reported in the paper on melody modelling and optical character recognition show that the proposed model can outperform the state-of-the-art. Also, the experimental results on optical character recognition, part-of-speech tagging and text chunking demonstrate that our model is comparable to recurrent neural networks with complex memory gates while requiring far fewer parameters.

  Access Paper or Ask Questions

Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations

Jul 07, 2017
Alex Beutel, Jilin Chen, Zhe Zhao, Ed H. Chi

How can we learn a classifier that is "fair" for a protected or sensitive group, when we do not know if the input to the classifier belongs to the protected group? How can we train such a classifier when data on the protected group is difficult to attain? In many settings, finding out the sensitive input attribute can be prohibitively expensive even during model training, and sometimes impossible during model serving. For example, in recommender systems, if we want to predict if a user will click on a given recommendation, we often do not know many attributes of the user, e.g., race or age, and many attributes of the content are hard to determine, e.g., the language or topic. Thus, it is not feasible to use a different classifier calibrated based on knowledge of the sensitive attribute. Here, we use an adversarial training procedure to remove information about the sensitive attribute from the latent representation learned by a neural network. In particular, we study how the choice of data for the adversarial training effects the resulting fairness properties. We find two interesting results: a small amount of data is needed to train these adversarial models, and the data distribution empirically drives the adversary's notion of fairness.

* Presented as a poster at the 2017 Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2017) 

  Access Paper or Ask Questions

Beyond knowing that: a new generation of epistemic logics

Nov 23, 2016
Yanjing Wang

Epistemic logic has become a major field of philosophical logic ever since the groundbreaking work by Hintikka (1962). Despite its various successful applications in theoretical computer science, AI, and game theory, the technical development of the field has been mainly focusing on the propositional part, i.e., the propositional modal logics of "knowing that". However, knowledge is expressed in everyday life by using various other locutions such as "knowing whether", "knowing what", "knowing how" and so on (knowing-wh hereafter). Such knowledge expressions are better captured in quantified epistemic logic, as was already discussed by Hintikka (1962) and his sequel works at length. This paper aims to draw the attention back again to such a fascinating but largely neglected topic. We first survey what Hintikka and others did in the literature of quantified epistemic logic, and then advocate a new quantifier-free approach to study the epistemic logics of knowing-wh, which we believe can balance expressivity and complexity, and capture the essential reasoning patterns about knowing-wh. We survey our recent line of work on the epistemic logics of "knowing whether", "knowing what" and "knowing how" to demonstrate the use of this new approach.

* 36 pages, to appear in Jaakko Hintikka on knowledge and game theoretical semantics, Springer's Outstanding Contributions to Logic Series (some references are updated in this version) 

  Access Paper or Ask Questions

Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores

Aug 31, 2016
Jonathan Ortigosa-Hernández, Iñaki Inza, Jose A. Lozano

Although a great methodological effort has been invested in proposing competitive solutions to the class-imbalance problem, little effort has been made in pursuing a theoretical understanding of this matter. In order to shed some light on this topic, we perform, through a novel framework, an exhaustive analysis of the adequateness of the most commonly used performance scores to assess this complex scenario. We conclude that using unweighted H\"older means with exponent $p \leq 1$ to average the recalls of all the classes produces adequate scores which are capable of determining whether a classifier is competitive. Then, we review the major solutions presented in the class-imbalance literature. Since any learning task can be defined as an optimisation problem where a loss function, usually connected to a particular score, is minimised, our goal, here, is to find whether the learning tasks found in the literature are also oriented to maximise the previously detected adequate scores. We conclude that they usually maximise the unweighted H\"older mean with $p = 1$ (a-mean). Finally, we provide bounds on the values of the studied performance scores which guarantee a classifier with a higher recall than the random classifier in each and every class.

  Access Paper or Ask Questions

Nonparametric Modeling of Dynamic Functional Connectivity in fMRI Data

Jun 08, 2016
Søren F. V. Nielsen, Kristoffer H. Madsen, Rasmus Røge, Mikkel N. Schmidt, Morten Mørup

Dynamic functional connectivity (FC) has in recent years become a topic of interest in the neuroimaging community. Several models and methods exist for both functional magnetic resonance imaging (fMRI) and electroencephalography (EEG), and the results point towards the conclusion that FC exhibits dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a non-parametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted in Bayesian statistical modeling we use the predictive likelihood to investigate if the model can discriminate between a motor task and rest both within and across subjects. We further investigate what drives dynamic states using the model on the entire data collated across subjects and task/rest. We find that the number of states extracted are driven by subject variability and preprocessing differences while the individual states are almost purely defined by either task or rest. This questions how we in general interpret dynamic FC and points to the need for more research on what drives dynamic FC.

* 8 pages, 1 figure. Presented at the Machine Learning and Interpretation in Neuroimaging Workshop (MLINI-2015), 2015 (arXiv:1605.04435) 

  Access Paper or Ask Questions

WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation

Mar 02, 2016
Jianfei Chen, Kaiwei Li, Jun Zhu, Wenguang Chen

Developing efficient and scalable algorithms for Latent Dirichlet Allocation (LDA) is of wide interest for many applications. Previous work has developed an O(1) Metropolis-Hastings sampling method for each token. However, the performance is far from being optimal due to random accesses to the parameter matrices and frequent cache misses. In this paper, we first carefully analyze the memory access efficiency of existing algorithms for LDA by the scope of random access, which is the size of the memory region in which random accesses fall, within a short period of time. We then develop WarpLDA, an LDA sampler which achieves both the best O(1) time complexity per token and the best O(K) scope of random access. Our empirical results in a wide range of testing conditions demonstrate that WarpLDA is consistently 5-15x faster than the state-of-the-art Metropolis-Hastings based LightLDA, and is comparable or faster than the sparsity aware F+LDA. With WarpLDA, users can learn up to one million topics from hundreds of millions of documents in a few hours, at an unprecedentedly throughput of 11G tokens per second.

  Access Paper or Ask Questions

Towards a General Framework for Actual Causation Using CP-logic

Oct 29, 2015
Sander Beckers, Joost Vennekens

Since Pearl's seminal work on providing a formal language for causality, the subject has garnered a lot of interest among philosophers and researchers in artificial intelligence alike. One of the most debated topics in this context regards the notion of actual causation, which concerns itself with specific - as opposed to general - causal claims. The search for a proper formal definition of actual causation has evolved into a controversial debate, that is pervaded with ambiguities and confusion. The goal of our research is twofold. First, we wish to provide a clear way to compare competing definitions. Second, we also want to improve upon these definitions so they can be applied to a more diverse range of instances, including non-deterministic ones. To achieve these goals we will provide a general, abstract definition of actual causation, formulated in the context of the expressive language of CP-logic (Causal Probabilistic logic). We will then show that three recent definitions by Ned Hall (originally formulated for structural models) and a definition of our own (formulated for CP-logic directly) can be viewed and directly compared as instantiations of this abstract definition, which allows them to deal with a broader range of examples.


  Access Paper or Ask Questions

Towards Resolving Software Quality-in-Use Measurement Challenges

Jan 30, 2015
Issa Atoum, Chih How Bong, Narayanan Kulathuramaiyer

Software quality-in-use comprehends the quality from user's perspectives. It has gained its importance in e-learning applications, mobile service based applications and project management tools. User's decisions on software acquisitions are often ad hoc or based on preference due to difficulty in quantitatively measure software quality-in-use. However, why quality-in-use measurement is difficult? Although there are many software quality models to our knowledge, no works surveys the challenges related to software quality-in-use measurement. This paper has two main contributions; 1) presents major issues and challenges in measuring software quality-in-use in the context of the ISO SQuaRE series and related software quality models, 2) Presents a novel framework that can be used to predict software quality-in-use, and 3) presents preliminary results of quality-in-use topic prediction. Concisely, the issues are related to the complexity of the current standard models and the limitations and incompleteness of the customized software quality models. The proposed framework employs sentiment analysis techniques to predict software quality-in-use.

* 9 pages, 4 figures, Journal of Emerging Trends in Computing and Information Sciences, Vol. 5, No. 11, November 2014 

  Access Paper or Ask Questions