"Topic": models, code, and papers

Exact marginal inference in Latent Dirichlet Allocation

Mar 31, 2020
Hartmut Maennel

Assume we have potential "causes" $z\in Z$, which produce "events" $w$ with known probabilities $\beta(w|z)$. We observe $w_1,w_2,\ldots,w_n$; what can we say about the distribution of the causes? A Bayesian estimate assumes a prior on distributions on $Z$ (we assume a Dirichlet prior) and calculates a posterior. An average over that posterior then gives a distribution on $Z$, which estimates how much each cause $z$ contributed to our observations. This is the setting of Latent Dirichlet Allocation, which can be applied, for example, to topics "producing" words in a document. In this setting the number of observed words is usually large, but the number of potential topics is small. Here we are interested in applications with many potential "causes" (e.g. locations on the globe), but only a few observations. We show that the exact Bayesian estimate can be computed in linear time (and constant space) in $|Z|$ for a given upper bound on $n$ with a surprisingly simple formula. We generalize this algorithm to the case of sparse probabilities $\beta(w|z)$, in which we only need to assume that the tree width of an "interaction graph" on the observations is limited. On the other hand, we also show that without such a limitation the problem is NP-hard.
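To make the quantity being estimated concrete, the following is a minimal brute-force sketch of the exact posterior mean over causes. It is not the linear-time formula from the paper: it enumerates every assignment of causes to observations, so its cost grows as $|Z|^n$. The function name `posterior_mean_bruteforce`, the list-of-lists layout of `beta`, and the toy numbers are assumptions made for illustration.

```python
from itertools import product
from math import exp, lgamma


def posterior_mean_bruteforce(alpha, beta, observations):
    """Exact posterior mean of the cause distribution, by enumeration.

    alpha: Dirichlet prior parameters, one per cause z (assumed layout).
    beta: beta[z][w] is the known probability of event w given cause z.
    observations: list of observed event indices w_1..w_n.
    Cost is |Z|**n, so this is only usable for tiny inputs.
    """
    k, n = len(alpha), len(observations)
    alpha0 = sum(alpha)
    total = 0.0
    mean = [0.0] * k
    for assign in product(range(k), repeat=n):
        counts = [0] * k
        for z in assign:
            counts[z] += 1
        # Likelihood of the observations under this assignment of causes.
        lik = 1.0
        for z, w in zip(assign, observations):
            lik *= beta[z][w]
        if lik == 0.0:
            continue
        # Dirichlet-multinomial prior weight of the assignment; the factor
        # Gamma(alpha0) / Gamma(alpha0 + n) is constant and cancels out.
        log_prior = sum(lgamma(a + c) - lgamma(a) for a, c in zip(alpha, counts))
        weight = exp(log_prior) * lik
        total += weight
        # Given the assignment, the posterior is Dirichlet(alpha + counts).
        for z in range(k):
            mean[z] += weight * (alpha[z] + counts[z]) / (alpha0 + n)
    if total == 0.0:
        raise ValueError("observations have zero probability under beta")
    return [m / total for m in mean]


# Toy example: two causes, two event types, one observation of event 0.
estimate = posterior_mean_bruteforce(
    alpha=[1.0, 1.0],
    beta=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0],
)
```

For a single observation the result agrees with the closed form $(\alpha_z + r_z)/(\alpha_0 + 1)$, where $r_z \propto \alpha_z\,\beta(w_1|z)$ is normalized over $z$ and $\alpha_0 = \sum_z \alpha_z$.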


Adversarial Security Attacks and Perturbations on Machine Learning and Deep Learning Methods

Jul 17, 2019
Arif Siddiqi

Ever-growing big data and emerging artificial intelligence (AI) demand the use of machine learning (ML) and deep learning (DL) methods. Cybersecurity also benefits from ML and DL methods in various types of applications. These methods, however, are susceptible to security attacks. Adversaries can exploit the training and testing data of the learning models, or can explore the workings of those models to launch advanced future attacks. The topic of adversarial security attacks and perturbations within the ML and DL domains is a recent area of exploration that has attracted great interest from security researchers and practitioners. The literature covers different adversarial security attacks and perturbations on ML and DL methods, each with its own presentation style and merits. However, the research community currently needs a review that consolidates knowledge on this growing and increasingly focused research topic. In this review paper, we specifically target new researchers in the cybersecurity domain who may seek to acquire some basic knowledge of machine learning and deep learning models and algorithms, as well as of some of the relevant adversarial security attacks and perturbations.


Entity Commonsense Representation for Neural Abstractive Summarization

Jun 14, 2018
Reinald Kim Amplayo, Seonjae Lim, Seung-won Hwang

A major proportion of a text summary consists of important entities found in the original text. These entities build up the topic of the summary. Moreover, they hold commonsense information once they are linked to a knowledge base. Based on these observations, this paper investigates the use of linked entities to guide the decoder of a neural text summarizer to generate concise and better summaries. To this end, we leverage an off-the-shelf entity linking system (ELS) to extract linked entities and propose Entity2Topic (E2T), a module easily attachable to a sequence-to-sequence model that transforms a list of entities into a vector representation of the topic of the summary. Currently available ELSs are still not sufficiently effective, possibly introducing unresolved ambiguities and irrelevant entities. We resolve the imperfections of the ELS by (a) encoding entities with selective disambiguation, and (b) pooling entity vectors using firm attention. By applying E2T to a simple sequence-to-sequence model with an attention mechanism as the base model, we see significant performance improvements on the Gigaword (sentence to title) and CNN (long document to multi-sentence highlights) summarization datasets, by at least 2 ROUGE points.

* Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) 
* NAACL 2018 


Variational Image Segmentation Model Coupled with Image Restoration Achievements

May 09, 2014
Xiaohao Cai

Image segmentation and image restoration are two important topics in image processing with great achievements. In this paper, we propose a new multiphase segmentation model by combining image restoration and image segmentation models. Utilizing image restoration aspects, the proposed segmentation model can effectively and robustly tackle highly noisy images, blurry images, images with missing pixels, and vector-valued images. In particular, one of the most important segmentation models, the piecewise constant Mumford-Shah model, can be extended easily in this way to segment gray and vector-valued images corrupted, for example, by noise, blur, or missing pixels, after coupling a new data fidelity term that comes from image restoration. It can be solved efficiently using the alternating minimization algorithm, and we prove the convergence of this algorithm with three variables under mild conditions. Experiments on many synthetic and real-world images demonstrate that our method gives better segmentation results than other state-of-the-art segmentation models, especially for blurry images and images with missing pixel values.

* 23 pages 


Quantum Control Experiments as a Testbed for Evolutionary Multi-Objective Algorithms

Dec 22, 2011
Ofer M. Shir, Jonathan Roslund, Zaki Leghtas, Herschel Rabitz

Experimental multi-objective Quantum Control is an emerging topic within the broad physics and chemistry applications domain of controlling quantum phenomena. This realm offers cutting edge ultrafast laser laboratory applications, which pose multiple objectives, noise, and possibly constraints on the high-dimensional search. In this study we introduce the topic of Multi-Observable Quantum Control (MOQC), and consider specific systems to be Pareto optimized subject to uncertainty, either experimentally or by means of simulated systems. The latter include a family of mathematical test-functions with a practical link to MOQC experiments, which are introduced here for the first time. We investigate the behavior of the multi-objective version of the Covariance Matrix Adaptation Evolution Strategy (MO-CMA-ES) and assess its performance on computer simulations as well as on laboratory closed-loop experiments. Overall, we propose a comprehensive study on experimental evolutionary Pareto optimization in high-dimensional continuous domains, draw some practical conclusions concerning the impact of fitness disturbance on algorithmic behavior, and raise several theoretical issues in the broad evolutionary multi-objective context.
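As a small illustration of what "Pareto optimized" means here, the sketch below filters a set of candidate solutions down to its non-dominated subset. It is not MO-CMA-ES and does not model noise; it assumes every objective is to be minimized, and the names `dominates` and `pareto_front` and the sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are assumed minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors for four candidate control settings.
candidates = [(1, 4), (2, 2), (3, 3), (4, 1)]
front = pareto_front(candidates)
```

Here `(3, 3)` is dropped because `(2, 2)` is better in both objectives, while the other three points represent different trade-offs and remain on the front.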


Learning by Teaching, with Application to Neural Architecture Search

Mar 11, 2021
Parth Sheth, Yueyu Jiang, Pengtao Xie

In human learning, an effective skill in improving learning outcomes is learning by teaching: a learner deepens his/her understanding of a topic by teaching this topic to others. In this paper, we aim to borrow this teaching-driven learning methodology from humans and leverage it to train more performant machine learning models, by proposing a novel ML framework referred to as learning by teaching (LBT). In the LBT framework, a teacher model improves itself by teaching a student model to learn well. Specifically, the teacher creates a pseudo-labeled dataset and uses it to train a student model. Based on how the student performs on a validation dataset, the teacher re-learns its model and re-teaches the student until the student achieves great validation performance. Our framework is based on three-level optimization which contains three stages: teacher learns; teacher teaches student; teacher re-learns based on how well the student performs. A simple but efficient algorithm is developed to solve the three-level optimization problem. We apply LBT to search neural architectures on CIFAR-10, CIFAR-100, and ImageNet. The efficacy of our method is demonstrated in various experiments.


Measuring the Novelty of Natural Language Text Using the Conjunctive Clauses of a Tsetlin Machine Text Classifier

Nov 17, 2020
Bimal Bhattarai, Ole-Christoffer Granmo, Lei Jiao

Most supervised text classification approaches assume a closed world, counting on all classes being present in the data at training time. This assumption can lead to unpredictable behaviour during operation whenever novel, previously unseen classes appear. Although deep learning-based methods have recently been used for novelty detection, they are challenging to interpret due to their black-box nature. This paper addresses interpretable open-world text classification, where the trained classifier must deal with novel classes during operation. To this end, we extend the recently introduced Tsetlin machine (TM) with a novelty scoring mechanism. The mechanism uses the conjunctive clauses of the TM to measure to what degree a text matches the classes covered by the training data. We demonstrate that the clauses provide a succinct interpretable description of known topics, and that our scoring mechanism makes it possible to discern novel topics from the known ones. Empirically, our TM-based approach outperforms seven other novelty detection schemes on three out of five datasets, and performs second and third best on the remaining two, with the added benefit of an interpretable propositional logic-based representation.
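The idea of scoring novelty by how well conjunctive clauses match a text can be illustrated with a toy sketch. This is not the paper's scoring mechanism and involves no Tsetlin machine training: the clauses are hand-written, each clause is assumed to be a pair of required and forbidden words, and the score is simply the fraction of clauses that fail to fire. All names and example clauses are hypothetical.

```python
def clause_fires(clause, words):
    """A conjunctive clause fires when all required words are present
    and none of the forbidden (negated) words are present."""
    required, forbidden = clause
    return required <= words and not (forbidden & words)


def novelty_score(clauses, text):
    """Toy score in [0, 1]: share of known-class clauses that do NOT match.

    Higher values suggest the text resembles the known classes less.
    """
    if not clauses:
        raise ValueError("at least one clause is required")
    words = set(text.lower().split())
    fired = sum(clause_fires(c, words) for c in clauses)
    return 1.0 - fired / len(clauses)


# Hand-written clauses standing in for learned ones (illustrative only).
known_clauses = [
    ({"goal", "match"}, set()),   # "goal AND match"
    ({"election"}, {"goal"}),     # "election AND NOT goal"
]
sports_score = novelty_score(known_clauses, "The match ended with a late goal")
unseen_score = novelty_score(known_clauses, "Quantum laser pulse shaping")
```

Because each clause is a readable conjunction of words, it is possible to inspect exactly which clauses matched a given text, which is the kind of interpretability the abstract refers to.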

* 10 pages, 5 figures, 3 tables 


On-Device Machine Learning: An Algorithms and Learning Theory Perspective

Nov 02, 2019
Sauptik Dhar, Junyao Guo, Jiayi Liu, Samarth Tripathi, Unmesh Kurup, Mohak Shah

The current paradigm for using machine learning models on a device is to train a model in the cloud and perform inference using the trained model on the device. However, with the increasing number of smart devices and improved hardware, there is interest in performing model training on the device. Given this surge in interest, a comprehensive survey of the field from a device-agnostic perspective sets the stage for both understanding the state-of-the-art and for identifying open challenges and future avenues of research. Since on-device learning is an expansive field with connections to a large number of related topics in AI and machine learning (including online learning, model adaptation, one/few-shot learning, etc), covering such a large number of topics in a single survey is impractical. Instead, this survey finds a middle ground by reformulating the problem of on-device learning as resource constrained learning where the resources are compute and memory. This reformulation allows tools, techniques, and algorithms from a wide variety of research areas to be compared equitably. In addition to summarizing the state of the art, the survey also identifies a number of challenges and next steps for both the algorithmic and theoretical aspects of on-device learning.

* Edge Learning, Resource Constrained Machine Learning, 36 pages survey 


Generating a Common Question from Multiple Documents using Multi-source Encoder-Decoder Models

Oct 25, 2019
Woon Sang Cho, Yizhe Zhang, Sudha Rao, Chris Brockett, Sungjin Lee

Ambiguous user queries in search engines result in the retrieval of documents that often span multiple topics. One potential solution is for the search engine to generate multiple refined queries, each of which relates to a subset of the documents spanning the same topic. A preliminary step towards this goal is to generate a question that captures common concepts of multiple documents. We propose a new task of generating a common question from multiple documents and present a simple variant of an existing multi-source encoder-decoder framework, called the Multi-Source Question Generator (MSQG). We first train an RNN-based single encoder-decoder generator from (single document, question) pairs. At test time, given multiple documents, the 'Distribute' step of our MSQG model predicts target word distributions for each document using the trained model. The 'Aggregate' step aggregates these distributions to generate a common question. This simple yet effective strategy significantly outperforms several existing baseline models applied to the new task when evaluated using automated metrics and human judgments on the MS-MARCO-QA dataset.
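The 'Distribute' and 'Aggregate' steps can be pictured with a minimal sketch of a single decoding step. The abstract does not specify the aggregation rule, so the uniform averaging used here is an assumption, as are the function name `aggregate_step`, the dictionary representation of word distributions, and the toy probabilities, which stand in for the outputs of a trained decoder.

```python
def aggregate_step(distributions):
    """Combine per-document next-word distributions into one choice.

    distributions: list of dicts mapping word -> probability, one dict per
    document (the output of a hypothetical 'Distribute' step).
    Returns the chosen word and the averaged distribution. Uniform
    averaging is an assumed rule, not necessarily the one used in MSQG.
    """
    if not distributions:
        raise ValueError("at least one distribution is required")
    vocab = set().union(*distributions)
    n = len(distributions)
    averaged = {w: sum(d.get(w, 0.0) for d in distributions) / n for w in vocab}
    best = max(averaged, key=averaged.get)
    return best, averaged


# Toy per-document predictions for the first word of the question.
per_document = [
    {"what": 0.6, "who": 0.4},
    {"what": 0.5, "where": 0.5},
]
word, averaged = aggregate_step(per_document)
```

In a full decoder this step would be repeated, feeding the chosen word back to every per-document decoder state before predicting the next word.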

* Accepted at EMNLP-IJCNLP 2019 - The 3rd Workshop on Neural Generation and Translation 


Emotion recognition with 4K resolution database

Oct 24, 2019
Qian Zheng

Classifying human emotion through facial expressions is a big topic in both the computer vision and deep learning fields. Human emotion can be classified either as one of the basic emotion types, such as angry or happy, or as a dimensional emotion with valence and arousal values. There are many related challenges on this topic; one of the most famous is the 'Affect-in-the-wild Challenge' (Aff-Wild Challenge). It is the first challenge on the estimation of valence and arousal in-the-wild. This project is an extension of the Aff-Wild Challenge. The Aff-Wild database was created using images with a mean resolution of 607*359; Dimitrios and I sought to find out the performance of a model trained on a database that contains 4K resolution in-the-wild images. Since no existing database satisfies this requirement, I built this database from scratch with help from Dimitrios and trained neural network models with different hyperparameters on it. I used network models such as VGG16, AlexNet, and ResNet, and also some pre-trained models such as ImageNet VGG. I compared the results of the different network models alongside the results from the Aff-Wild database to identify the optimal model for my database.
