Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

PVS Embeddings of Propositional and Quantified Modal Logic

May 12, 2022
John Rushby

Modal logics allow reasoning about various modes of truth: for example, what it means for something to be possibly true, or to know that something is true as opposed to merely believing it. This report describes embeddings of propositional and quantified modal logic in the PVS verification system. The resources of PVS allow this to be done in an attractive way that supports much of the standard syntax of modal logic, while providing effective automation. The report introduces and formally specifies and verifies several standard topics in modal logic such as relationships between the standard modal axioms and properties of the accessibility relation, and attributes of the Barcan Formula and its converse in both constant and varying domains.

  Access Paper or Ask Questions

Quantifying Reproducibility in NLP and ML

Sep 02, 2021
Anya Belz

Reproducibility has become an intensely debated topic in NLP and ML over recent years, but no commonly accepted way of assessing reproducibility, let alone quantifying it, has so far emerged. The assumption has been that wider scientific reproducibility terminology and definitions are not applicable to NLP/ML, with the result that many different terms and definitions have been proposed, some diametrically opposed. In this paper, we test this assumption, by taking the standard terminology and definitions from metrology and applying them directly to NLP/ML. We find that we are able to straightforwardly derive a practical framework for assessing reproducibility which has the desirable property of yielding a quantified degree of reproducibility that is comparable across different reproduction studies.

  Access Paper or Ask Questions

Stack Index Prediction Using Time-Series Analysis

Aug 18, 2021
Raja CSP Raman, Rohith Mahadevan, Divya Perumal, Vedha Sankar, Talha Abdur Rahman

The Prevalence of Community support and engagement for different domains in the tech industry has changed and evolved throughout the years. In this study, we aim to understand, analyze and predict the trends of technology in a scientific manner, having collected data on numerous topics and their growth throughout the years in the past decade. We apply machine learning models on collected data, to understand, analyze and forecast the trends in the advancement of different fields. We show that certain technical concepts such as python, machine learning, and Keras have an undisputed uptrend, finally concluding that the Stackindex model forecasts with high accuracy and can be a viable tool for forecasting different tech domains.

* 10 pages, 9 figures 

  Access Paper or Ask Questions

Cognitively Aided Zero-Shot Automatic Essay Grading

Feb 22, 2021
Sandeep Mathias, Rudra Murthy, Diptesh Kanojia, Pushpak Bhattacharyya

Automatic essay grading (AEG) is a process in which machines assign a grade to an essay written in response to a topic, called the prompt. Zero-shot AEG is when we train a system to grade essays written to a new prompt which was not present in our training data. In this paper, we describe a solution to the problem of zero-shot automatic essay grading, using cognitive information, in the form of gaze behaviour. Our experiments show that using gaze behaviour helps in improving the performance of AEG systems, especially when we provide a new essay written in response to a new prompt for scoring, by an average of almost 5 percentage points of QWK.

* This paper was accepted for publication at ICON 2020: The 17th International Conference on Natural Language Processing, on December 20, 2021 

  Access Paper or Ask Questions

Generating Wikipedia Article Sections from Diverse Data Sources

Dec 29, 2020
Mingda Chen, Sam Wiseman, Kevin Gimpel

Datasets for data-to-text generation typically focus either on multi-domain, single-sentence generation or on single-domain, long-form generation. In this work, we create a large-scale dataset, WikiTableT, that pairs Wikipedia sections with their corresponding tabular data and various metadata. WikiTableT contains millions of instances, covering a broad range of topics, as well as a variety of flavors of generation tasks with different levels of flexibility. We benchmark several training and decoding strategies on WikiTableT. Our qualitative analysis shows that the best approaches can generate fluent and high quality texts but they sometimes struggle with coherence.

  Access Paper or Ask Questions

A Survey on Computational Propaganda Detection

Jul 15, 2020
Giovanni Da San Martino, Stefano Cresci, Alberto Barron-Cedeno, Seunghak Yu, Roberto Di Pietro, Preslav Nakov

Propaganda campaigns aim at influencing people's mindset with the purpose of advancing a specific agenda. They exploit the anonymity of the Internet, the micro-profiling ability of social networks, and the ease of automatically creating and managing coordinated networks of accounts, to reach millions of social network users with persuasive messages, specifically targeted to topics each individual user is sensitive to, and ultimately influencing the outcome on a targeted issue. In this survey, we review the state of the art on computational propaganda detection from the perspective of Natural Language Processing and Network Analysis, arguing about the need for combined efforts between these communities. We further discuss current challenges and future research directions.

* IJCAI-2020 
* propaganda detection, disinformation, misinformation, fake news, media bias 

  Access Paper or Ask Questions

Large-scale text processing pipeline with Apache Spark

Dec 02, 2019
Alexey Svyatkovskiy, Kosuke Imai, Mary Kroeger, Yuki Shiraito

In this paper, we evaluate Apache Spark for a data-intensive machine learning problem. Our use case focuses on policy diffusion detection across the state legislatures in the United States over time. Previous work on policy diffusion has been unable to make an all-pairs comparison between bills due to computational intensity. As a substitute, scholars have studied single topic areas. We provide an implementation of this analysis workflow as a distributed text processing pipeline with Spark dataframes and Scala application programming interface. We discuss the challenges and strategies of unstructured data processing, data formats for storage and efficient access, and graph processing at scale.

* Published in Proceedings of Big NLP workshop at the IEEE Big Data Conference 2016 

  Access Paper or Ask Questions

Statistical Model Aggregation via Parameter Matching

Nov 01, 2019
Mikhail Yurochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenewald, Trong Nghia Hoang

We consider the problem of aggregating models learned from sequestered, possibly heterogeneous datasets. Exploiting tools from Bayesian nonparametrics, we develop a general meta-modeling framework that learns shared global latent structures by identifying correspondences among local model parameterizations. Our proposed framework is model-independent and is applicable to a wide range of model types. After verifying our approach on simulated data, we demonstrate its utility in aggregating Gaussian topic models, hierarchical Dirichlet process based hidden Markov models, and sparse Gaussian processes with applications spanning text summarization, motion capture analysis, and temperature forecasting.

* NeurIPS 2019 

  Access Paper or Ask Questions

Improved Super-Resolution Convolution Neural Network for Large Images

Jul 26, 2019
Junyu, Wang, Rong Song

Single image super-resolution (SISR) is a very popular topic nowadays, which has both research value and practical value. In daily life, we crop a large image into sub-images to do super-resolution and then merge them together. Although convolution neural network performs very well in the research field, if we use it to do super-resolution, we can easily observe cutting lines from merged pictures. To address these problems, in this paper, we propose a refined architecture of SRCNN with 'Symmetric padding', 'Random learning' and 'Residual learning'. Moreover, we have done a lot of experiments to prove our model performs best among a lot of the state-of-art methods.

  Access Paper or Ask Questions

A Brief Introduction to Machine Learning for Engineers

May 17, 2018
Osvaldo Simeone

This monograph aims at providing an introduction to key concepts, algorithms, and theoretical results in machine learning. The treatment concentrates on probabilistic models for supervised and unsupervised learning problems. It introduces fundamental concepts and algorithms by building on first principles, while also exposing the reader to more advanced topics with extensive pointers to the literature, within a unified notation and mathematical framework. The material is organized according to clearly defined categories, such as discriminative and generative models, frequentist and Bayesian approaches, exact and approximate inference, as well as directed and undirected models. This monograph is meant as an entry point for researchers with a background in probability and linear algebra.

* This is an expanded and improved version of the original posting. Feedback is welcome 

  Access Paper or Ask Questions