"Topic": models, code, and papers

Theory of Dependent Hierarchical Normalized Random Measures

May 25, 2012
Changyou Chen, Wray Buntine, Nan Ding

This paper presents theory for Normalized Random Measures (NRMs), Normalized Generalized Gammas (NGGs), a particular kind of NRM, and Dependent Hierarchical NRMs, which allow networks of dependent NRMs to be analysed. These have been used, for instance, for time-dependent topic modelling. In this paper, we first introduce some mathematical background on completely random measures (CRMs) and their construction from Poisson processes, and then introduce NRMs and NGGs. Slice sampling is also introduced for posterior inference. The dependency operators in Poisson processes and for the corresponding CRMs and NRMs are then introduced, and posterior inference for the NGG is presented. Finally, we give dependency and composition results for applying these operators to NRMs, so that they can be used in a network with hierarchical and dependent relations.
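As a rough illustration of what "normalizing" a random measure means, the sketch below uses a finite-dimensional analogue: independent gamma-distributed weights divided by their sum, which yields a Dirichlet-distributed probability vector. This is not the paper's construction, which builds CRMs with infinitely many atoms from Poisson processes; the function name `normalized_gamma_weights` and the shape parameters are assumptions chosen only for the example.

```python
import random


def normalized_gamma_weights(shape_params, rng):
    """Draw independent Gamma(a_k, 1) weights and normalize them to sum to one.

    Finite-dimensional stand-in for normalizing a random measure; the
    resulting vector follows a Dirichlet(shape_params) distribution.
    """
    raw = [rng.gammavariate(a, 1.0) for a in shape_params]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical shape parameters for three atoms; the seed only makes the draw repeatable.
weights = normalized_gamma_weights([2.0, 1.0, 0.5], random.Random(0))
print(weights)
```

The normalized weights form a random probability vector, which is the role an NRM plays as a prior over distributions.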

Quantum-Like Uncertain Conditionals for Text Analysis

Jun 02, 2011
Alvaro Francisco Huertas-Rosero, C. J. van Rijsbergen

Simple representations of documents based on the occurrences of terms are ubiquitous in areas like Information Retrieval, and also frequent in Natural Language Processing. In this work we propose a logical-probabilistic approach to the analysis of natural language text based on the concept of Uncertain Conditional, on top of a formulation of lexical measurements inspired by the theoretical concept of ideal quantum measurements. The proposed concept can be used for generating topic-specific representations of text, aiming to match in a simple way the perception of a user with a pre-established idea of what the usage of terms in the text should be. A simple example is developed with two versions of a text in two languages, showing how regularities in the use of terms are detected and easily represented.

* 11 pages, 2 figures. To be published in the proceedings of Quantum Interaction 2011 

Presenting Punctuation

Jun 10, 1995
Michael White

Until recently, punctuation has received very little attention in the linguistics and computational linguistics literature. Since the publication of Nunberg's (1990) monograph on the topic, however, punctuation has seen its stock begin to rise: spurred in part by Nunberg's ground-breaking work, a number of valuable inquiries have been subsequently undertaken, including Hovy and Arens (1991), Dale (1991), Pascual (1993), Jones (1994), and Briscoe (1994). Continuing this line of research, I investigate in this paper how Nunberg's approach to presenting punctuation (and other formatting devices) might be incorporated into NLG systems. Insofar as the present paper focuses on the proper syntactic treatment of punctuation, it differs from these other subsequent works in that it is the first to examine this issue from the generation perspective.

* In Proceedings of the Fifth European Workshop on Natural Language Generation, Leiden, The Netherlands, pp. 107--125. 
* compressed uuencoded PostScript, 19 pages; Word 6.0 doc available upon request from [email protected] 

Politeness Counts: Perceptions of Peacekeeping Robots

May 19, 2022
Ohad Inbar, Joachim Meyer

The 'intuitive' trust people feel when encountering robots in public spaces is a key determinant of their willingness to cooperate with these robots. We conducted four experiments to study this topic in the context of peacekeeping robots. Participants viewed scenarios, presented as static images or animations, involving a robot or a human guard performing an access-control task. The guards interacted more or less politely with younger and older male and female people. Our results show strong effects of the guard's politeness. Age and sex of the people interacting with the guard had no significant effect on participants' impressions of its attributes. There were no differences between responses to robot and human guards. This study advances the notion that politeness is a crucial determinant of people's perception of peacekeeping robots.

* IEEE Transactions on Human-Machine Systems, 49 (3), 232-240 (2019) 

Answering Count Queries with Explanatory Evidence

Apr 11, 2022
Shrestha Ghosh, Simon Razniewski, Gerhard Weikum

Count queries, such as "number of songs by John Lennon", are a challenging case in web search and question answering. Prior methods merely answer these with a single, and sometimes puzzling, number or return a ranked list of text snippets with different numbers. This paper proposes a methodology for answering count queries with inference, contextualization and explanatory evidence. Unlike previous systems, our method infers final answers from multiple observations, supports semantic qualifiers for the counts, and provides evidence by enumerating representative instances. Experiments with a wide variety of queries show the benefits of our method. To promote further research on this underexplored topic, we release an annotated dataset of 5k queries with 200k relevant text spans.

* Version accepted at SIGIR 2022 

An Algorithm for Generating Gap-Fill Multiple Choice Questions of an Expert System

Sep 17, 2021
Pornpat Sirithumgul, Pimpaka Prasertsilp, Lorne Olfman

This research proposes an artificial intelligence algorithm comprising an ontology-based design, text mining, and natural language processing for automatically generating gap-fill multiple choice questions (MCQs). The simulation in this research demonstrated an application of the algorithm in generating gap-fill MCQs about software testing. The simulation results revealed that, using 103 online documents as inputs, the algorithm could automatically produce more than 16 thousand valid gap-fill MCQs covering a variety of topics in the software testing domain. Finally, in the discussion section of this paper we suggest how the proposed algorithm should be applied to produce gap-fill MCQs for a question pool used by a knowledge expert system.

A Weak Supervised Dataset of Fine-Grained Emotions in Portuguese

Aug 17, 2021
Diogo Cortiz, Jefferson O. Silva, Newton Calegari, Ana Luísa Freitas, Ana Angélica Soares, Carolina Botelho, Gabriel Gaudencio Rêgo, Waldir Sampaio, Paulo Sergio Boggio

Affective Computing is the study of how computers can recognize, interpret and simulate human affects. Sentiment Analysis is a common task in NLP related to this topic, but it focuses only on emotion valence (positive, negative, neutral). An emerging approach in NLP is Emotion Recognition, which relies on fine-grained classification. This research describes an approach to create a lexical-based weakly supervised corpus for fine-grained emotion in Portuguese. We evaluate our dataset by fine-tuning a transformer-based language model (BERT) and validating it on a gold-standard annotated validation set. Our results (F1-score = 0.64) suggest lexical-based weak supervision as an appropriate strategy for initial work in low-resource environments.

An ontology for the formalization and visualization of scientific knowledge

Jul 09, 2021
Vincenzo Daponte, Gilles Falquet

The construction of an ontology of scientific knowledge objects, presented here, is part of the development of an approach oriented towards the visualization of scientific knowledge. It is motivated by the fact that the concepts of organization of scientific knowledge (theorem, law, experience, proof, etc.) appear in existing ontologies but that none of them is centered on this topic and presents a simple and easily usable organization. We present the first version built from ontological sources (ontologies of knowledge objects of certain fields, lexical and higher level ones), specialized knowledge bases and interviews with scientists. We have aligned this ontology with some of the sources used, which has allowed us to verify its consistency with respect to them. The validation of the ontology consists in using it to formalize knowledge from various sources, which we have begun to do in the field of physics.

Mitigating Political Bias in Language Models Through Reinforced Calibration

Apr 30, 2021
Ruibo Liu, Chenyan Jia, Jason Wei, Guangxuan Xu, Lili Wang, Soroush Vosoughi

Current large-scale language models can be politically biased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real-world settings. In this paper, we describe metrics for measuring political bias in GPT-2 generation and propose a reinforcement learning (RL) framework for mitigating political biases in generated text. By using rewards from word embeddings or a classifier, our RL framework guides debiased generation without having access to the training data or requiring the model to be retrained. In empirical experiments on three attributes sensitive to political bias (gender, location, and topic), our methods reduced bias according to both our metrics and human evaluation, while maintaining readability and semantic coherence.

* In proceedings of the 35th AAAI Conference on Artificial Intelligence 

Scaling up graph homomorphism for classification via sampling

Apr 08, 2021
Paul Beaujean, Florian Sikora, Florian Yger

Feature generation is an open topic of investigation in graph machine learning. In this paper, we study the use of graph homomorphism density features as a scalable alternative to homomorphism numbers that retains similar theoretical properties and the ability to take inductive bias into account. For this, we propose a high-performance implementation of a simple sampling algorithm which computes additive approximations of homomorphism densities. In the context of graph machine learning, we demonstrate in experiments that simple linear models trained on sample homomorphism densities can achieve performance comparable to graph neural networks on standard graph classification datasets. Finally, we show in experiments on synthetic data that this algorithm scales to very large graphs when implemented with Bloom filters.

* 17 pages, 1 figure 
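
The sampling idea in the abstract above can be sketched with a plain Monte Carlo estimator: the homomorphism density t(F, G) is the probability that a uniformly random map from the vertices of a pattern F to the vertices of a graph G sends every edge of F to an edge of G. The code below is a minimal sketch under that definition, not the authors' high-performance implementation, and it uses adjacency sets rather than Bloom filters. The function name `sample_hom_density` and the example graphs are assumptions for illustration.

```python
import random


def sample_hom_density(pattern_edges, pattern_size, graph_adj, num_samples, rng):
    """Estimate t(F, G) by sampling uniformly random vertex maps F -> G.

    pattern_edges: edges of F as pairs of vertex indices in range(pattern_size)
    graph_adj: dict mapping each vertex of G to the set of its neighbours
    """
    nodes = list(graph_adj)
    hits = 0
    for _ in range(num_samples):
        # Each pattern vertex is mapped independently and uniformly into G.
        mapping = [rng.choice(nodes) for _ in range(pattern_size)]
        # The map is a homomorphism if every pattern edge lands on a graph edge.
        if all(mapping[v] in graph_adj[mapping[u]] for u, v in pattern_edges):
            hits += 1
    return hits / num_samples


# G is the complete graph K4 without self-loops; F is a triangle.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
triangle = [(0, 1), (1, 2), (0, 2)]
estimate = sample_hom_density(triangle, 3, k4, 20000, random.Random(0))
print(estimate)
```

For this example the exact density is 24/64 = 0.375, since a map is a homomorphism exactly when the three images are distinct, so the estimate should land close to that value. The error of such an estimator is additive and shrinks with the number of samples, independent of the size of G.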
