"Topic": models, code, and papers

Simplified Minimal Gated Unit Variations for Recurrent Neural Networks

Jan 12, 2017
Joel Heck, Fathi M. Salem

Recurrent neural networks with various types of hidden units have been used to solve a diverse range of problems involving sequence data. Two of the most recent proposals, gated recurrent units (GRU) and minimal gated units (MGU), have shown comparable promising results on example public datasets. In this paper, we introduce three model variants of the minimal gated unit (MGU) which further simplify that design by reducing the number of parameters in the forget-gate dynamic equation. These three model variants, referred to simply as MGU1, MGU2, and MGU3, were tested on sequences generated from the MNIST dataset and from the Reuters Newswire Topics (RNT) dataset. The new models have shown similar accuracy to the MGU model while using fewer parameters and thus lowering training expense. One model variant, namely MGU2, performed better than MGU on the datasets considered, and thus may be used as an alternative to MGU or GRU in recurrent neural networks.

* 5 pages, 3 Figures, 5 Tables 
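
For orientation, the baseline MGU that these variants simplify is usually written as below. This is a sketch taken from the standard MGU formulation, not from this abstract, and the symbols $W_f, U_f, b_f, W_h, U_h, b_h$ are notation assumed here. The abstract only says that the variants reduce the parameters in the forget-gate equation (the first line); it does not state which terms MGU1, MGU2, and MGU3 each drop.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h\right) \\
h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t
\end{aligned}
```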

Coherent Dialogue with Attention-based Language Models

Nov 21, 2016
Hongyuan Mei, Mohit Bansal, Matthew R. Walter

We model coherent conversation continuation via RNN-based dialogue models equipped with a dynamic attention mechanism. Our attention-RNN language model dynamically increases the scope of attention on the history as the conversation continues, as opposed to standard attention (or alignment) models with a fixed input scope in a sequence-to-sequence model. This allows each generated word to be associated with the most relevant words in its corresponding conversation history. We evaluate the model on two popular dialogue datasets, the open-domain MovieTriples dataset and the closed-domain Ubuntu Troubleshoot dataset, and achieve significant improvements over the state-of-the-art and baselines on several metrics, including complementary diversity-based metrics, human evaluation, and qualitative visualizations. We also show that a vanilla RNN with dynamic attention outperforms more complex memory models (e.g., LSTM and GRU) by allowing for flexible, long-distance memory. We promote further coherence via topic modeling-based reranking.

* To appear at AAAI 2017 

Towards A Deeper Geometric, Analytic and Algorithmic Understanding of Margins

Jan 29, 2016
Aaditya Ramdas, Javier Peña

Given a matrix $A$, a linear feasibility problem (of which linear classification is a special case) aims to find a solution to a primal problem $w: A^Tw > \textbf{0}$ or a certificate for the dual problem which is a probability distribution $p: Ap = \textbf{0}$. Inspired by the continued importance of "large-margin classifiers" in machine learning, this paper studies a condition measure of $A$ called its \textit{margin} that determines the difficulty of both the above problems. To aid geometrical intuition, we first establish new characterizations of the margin in terms of relevant balls, cones and hulls. Our second contribution is analytical, where we present generalizations of Gordan's theorem, and variants of Hoffman's theorems, both using margins. We end by proving some new results on a classical iterative scheme, the Perceptron, whose convergence rates famously depend on the margin. Our results are relevant for a deeper understanding of margin-based learning and proving convergence rates of iterative schemes, apart from providing a unifying perspective on this vast topic.

* Optimization Methods and Software, Volume 31, Issue 2, Pages 377-391, 2016 
* 18 pages, 3 figures 
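
The primal problem in this abstract, finding $w$ with $A^Tw > \textbf{0}$, can be illustrated with the classical Perceptron update. The sketch below is a minimal illustration and not the paper's own algorithm or analysis; the function names and the toy columns are assumptions made for the example. For unit-norm columns with margin $\rho > 0$, the classical bound is at most $1/\rho^2$ updates.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def perceptron(columns, max_updates=10_000):
    """Search for w with <a, w> > 0 for every column a of A.

    Returns w as a list, or None if no such w was found within
    max_updates rounds (for example, when the problem is infeasible).
    """
    w = [0.0] * len(columns[0])
    for _ in range(max_updates):
        # Pick any column whose constraint is currently violated.
        violated = next((a for a in columns if dot(a, w) <= 0), None)
        if violated is None:
            return w
        # Classical Perceptron step: add the violated column to w.
        w = [wi + ai for wi, ai in zip(w, violated)]
    return None


# Hypothetical toy instance: three unit-norm columns in the plane.
columns = [(1.0, 0.0), (0.6, 0.8), (0.0, 1.0)]
w = perceptron(columns)
```

On an infeasible instance such as the two columns `(1.0, 0.0)` and `(-1.0, 0.0)`, the updates cycle and the function returns `None` once the update budget is spent.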

Compressed Counting Meets Compressed Sensing

Oct 03, 2013
Ping Li, Cun-Hui Zhang, Tong Zhang

Compressed sensing (sparse signal recovery) has been a popular and important research topic in recent years. By observing that natural signals are often nonnegative, we propose a new framework for nonnegative signal recovery using Compressed Counting (CC). CC is a technique built on maximally-skewed p-stable random projections originally developed for data stream computations. Our recovery procedure is computationally very efficient in that it requires only one linear scan of the coordinates. Our analysis demonstrates that, when 0 [...] 0 and C=pi/2 when p=0.5. In particular, when p->0 the required number of measurements is essentially M=K\log N, where K is the number of nonzero coordinates of the signal.

Image-based Face Detection and Recognition: "State of the Art"

Feb 26, 2013
Faizan Ahmad, Aaima Najam, Zeeshan Ahmed

Face recognition from images or video is a popular topic in biometrics research. Many public places have surveillance cameras for video capture, and these cameras have significant value for security purposes. It is widely acknowledged that face recognition plays an important role in surveillance systems, as it does not need the subject's cooperation. The actual advantages of face-based identification over other biometrics are uniqueness and acceptance. Because the human face is a dynamic object with a high degree of variability in its appearance, face detection is a difficult problem in computer vision. In this field, accuracy and speed of identification are the main issues. The goal of this paper is to evaluate various face detection and recognition methods and to provide a complete solution for image-based face detection and recognition with higher accuracy and a better response rate, as an initial step for video surveillance. A solution is proposed based on tests performed on various face-rich databases that vary in subjects, pose, emotions, race and light.

* IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 6, No 1, November 2012 ISSN (Online): 1694-0814 
* 4 pages, 3 tables, 4 figures 

Anomaly Sequences Detection from Logs Based on Compression

Sep 08, 2011
Nan Wang, Jizhong Han, Jinyun Fang

Mining information from logs is an old and still active research topic. In recent years, with the rapid emergence of cloud computing, log mining has become increasingly important to industry. This paper focuses on one major mission of log mining, anomaly detection, and proposes a novel method for mining abnormal sequences from large logs. Unlike previous anomaly detection systems, which are based on statistics, probabilities and the Markov assumption, our approach measures the strangeness of a sequence using compression. It first trains a grammar of normal behaviors using grammar-based compression, then measures the information quantities and densities of questionable sequences according to the increase in grammar length. We have applied our approach to mining some real bugs from fine-grained execution logs. We have also tested its ability on intrusion detection using some publicly available system call traces. The experiments show that our method successfully selects the strange sequences that are related to bugs or attacks.

* 7 pages, 5 figures, 6 tables 
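
The idea of scoring a sequence by how much it adds to a compressed model of normal behavior can be sketched in a few lines. The example below is an illustration only: it swaps the paper's grammar-based compression for the general-purpose `zlib` compressor, and the function names and toy log lines are assumptions made for the example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def strangeness(normal_log: bytes, sequence: bytes) -> int:
    """Extra compressed bytes needed once `sequence` is appended to the normal log.

    A sequence that resembles the normal log is mostly encoded as back-references
    and adds little; an unfamiliar sequence adds more.
    """
    return compressed_size(normal_log + sequence) - compressed_size(normal_log)


# Hypothetical toy data: a repetitive "normal" trace and two candidate sequences.
normal_log = b"open read write close\n" * 200
usual = b"open read write close\n" * 3
odd = b"fork ptrace mmap execve setuid kill\n"

usual_score = strangeness(normal_log, usual)
odd_score = strangeness(normal_log, odd)
```

Here `odd_score` comes out larger than `usual_score`, so ranking candidate sequences by this score surfaces the unfamiliar one first. A grammar-based compressor, as used in the paper, would replace `compressed_size` with the growth in grammar length.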

Mixed Initiative in Dialogue: An Investigation into Discourse Segmentation

Apr 05, 1995
Marilyn Walker, Steve Whittaker

Conversation between two people is usually mixed-initiative, with control over the conversation being transferred from one person to another. We apply a set of rules for the transfer of control to 4 sets of dialogues consisting of a total of 1862 turns. The application of the control rules lets us derive domain-independent discourse structures. The derived structures indicate that initiative plays a role in the structuring of discourse. In order to explore the relationship of control and initiative to discourse processes like centering, we analyze the distribution of four different classes of anaphora for two data sets. This distribution indicates that some control segments are hierarchically related to others. The analysis suggests that discourse participants often mutually agree to a change of topic. We also compared initiative in Task Oriented and Advice Giving dialogues and found that both allocation of control and the manner in which control is transferred are radically different for the two dialogue types. These differences can be explained in terms of collaborative planning principles.

* Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics, 1990 
* 8 pages, latex 

Explainable Deep Learning Methods in Medical Diagnosis: A Survey

May 10, 2022
Cristiano Patrício, João C. Neves, Luís F. Teixeira

The remarkable success of deep learning has prompted interest in its application to medical diagnosis. Even though state-of-the-art deep learning models have achieved human-level accuracy on the classification of different types of medical data, these models are hardly adopted in clinical workflows, mainly due to their lack of interpretability. The black-box-ness of deep learning models has raised the need for devising strategies to explain the decision process of these models, leading to the creation of the topic of eXplainable Artificial Intelligence (XAI). In this context, we provide a thorough survey of XAI applied to medical diagnosis, including visual, textual, and example-based explanation methods. Moreover, this work reviews the existing medical imaging datasets and the existing metrics for evaluating the quality of the explanations. Complementary to most existing surveys, we include a performance comparison among a set of report generation-based methods. Finally, the major challenges in applying XAI to medical imaging are also discussed.

Toward A Fine-Grained Analysis of Distribution Shifts in MSMARCO

May 05, 2022
Simon Lupart, Stéphane Clinchant

Recent IR approaches based on Pretrained Language Models (PLM) have now largely outperformed their predecessors on a variety of IR tasks. However, what happens to learned word representations with distribution shifts remains unclear. Recently, the BEIR benchmark was introduced to assess the performance of neural rankers in zero-shot settings and revealed deficiencies for several models. As a complement to BEIR, we propose to explicitly control distribution shifts. We selected different query subsets leading to different distribution shifts: short versus long queries, wh-word types of queries and 5 topic-based clusters. Then, we benchmarked state-of-the-art neural rankers such as dense Bi-Encoder, SPLADE and ColBERT under these different training and test conditions. Our study demonstrates that it is possible to design distribution shift experiments within the MSMARCO collection, and that the query subsets we selected constitute an additional benchmark to better study factors of generalization for various models.

Coevolutionary Pareto Diversity Optimization

Apr 12, 2022
Aneta Neumann, Denis Antipov, Frank Neumann

Computing diverse sets of high quality solutions for a given optimization problem has become an important topic in recent years. In this paper, we introduce a coevolutionary Pareto Diversity Optimization approach which builds on the success of reformulating a constrained single-objective optimization problem as a bi-objective problem by turning the constraint into an additional objective. Our new Pareto Diversity optimization approach uses this bi-objective formulation to optimize the problem while also maintaining an additional population of high quality solutions for which diversity is optimized with respect to a given diversity measure. We show that our standard co-evolutionary Pareto Diversity Optimization approach outperforms the recently introduced DIVEA algorithm which obtains its initial population by generalized diversifying greedy sampling and improving the diversity of the set of solutions afterwards. Furthermore, we study possible improvements of the Pareto Diversity Optimization approach. In particular, we show that the use of inter-population crossover further improves the diversity of the set of solutions.
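
The reformulation mentioned in this abstract, turning a constraint into a second objective, can be illustrated with a small dominance check. This is a minimal sketch and not the coevolutionary algorithm itself; the knapsack-style `(profit, weight)` pairs, the capacity, and the function names are hypothetical choices made for the example.

```python
def dominates(a, b):
    """True if solution a is at least as good as b in both objectives and differs.

    Each solution is a (profit, weight) pair: profit is maximized, and weight,
    formerly the constrained quantity, is minimized as a second objective.
    """
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def pareto_front(points):
    """Keep the solutions that no other solution dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def best_feasible(front, capacity):
    """Recover the constrained optimum: highest profit with weight <= capacity."""
    feasible = [p for p in front if p[1] <= capacity]
    return max(feasible, key=lambda p: p[0]) if feasible else None


# Hypothetical candidate solutions as (profit, weight) pairs.
candidates = [(10, 5), (8, 5), (12, 7), (6, 2), (12, 9)]
front = pareto_front(candidates)
answer = best_feasible(front, capacity=6)
```

The front keeps one trade-off point per weight level that matters, so the answer to the original constrained problem can be read off for any capacity without re-optimizing.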
