Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

GPoeT-2: A GPT-2 Based Poem Generator

May 18, 2022
Kai-Ling Lo, Rami Ariss, Philipp Kurz

This project aims to produce the next volume of machine-generated poetry, a complex art form that can be structured and unstructured, and carries depth in the meaning between the lines. GPoeT-2 is based on fine-tuning a state of the art natural language model (i.e. GPT-2) to generate limericks, typically humorous structured poems consisting of five lines with a AABBA rhyming scheme. With a two-stage generation system utilizing both forward and reverse language modeling, GPoeT-2 is capable of freely generating limericks in diverse topics while following the rhyming structure without any seed phrase or a posteriori constraints.Based on the automated generation process, we explore a wide variety of evaluation metrics to quantify "good poetry," including syntactical correctness, lexical diversity, and subject continuity. Finally, we present a collection of 94 categorized limericks that rank highly on the explored "good poetry" metrics to provoke human creativity.

* Carnegie Mellon University 11-785: Intro to Deep Learning Final Project 

  Access Paper or Ask Questions

Sampling with Attribute-Related Information for Controlling Language Models

May 12, 2022
Shangda Wu, Maosong Sun

The dominant approaches for controlling language models are based on fine-tuning large language models or prompt engineering. However, these methods often require condition-specific data or considerable hand-crafting. We propose a new simple guided decoding method, Gamma Sampling, which does not require complex engineering and any extra data. Gamma Sampling introduces attribute-related information (provided by humans or language models themselves) into the sampling process to guide language models to generate texts with desired attributes. Experiments on controlling topics and sentiments of generated text show Gamma Sampling to be superior in diversity, attribute relevance and overall quality of generated samples while maintaining a fast generation speed. In addition, we successfully applied Gamma Sampling to control other attributes of language such as relatedness and repetition, which further demonstrates the versatility and effectiveness of this method. Gamma Sampling is now available in the python package samplings via import gamma sampling from samplings.

* 21 pages, 4 figures 

  Access Paper or Ask Questions

CATs are Fuzzy PETs: A Corpus and Analysis of Potentially Euphemistic Terms

May 05, 2022
Martha Gavidia, Patrick Lee, Anna Feldman, Jing Peng

Euphemisms have not received much attention in natural language processing, despite being an important element of polite and figurative language. Euphemisms prove to be a difficult topic, not only because they are subject to language change, but also because humans may not agree on what is a euphemism and what is not. Nevertheless, the first step to tackling the issue is to collect and analyze examples of euphemisms. We present a corpus of potentially euphemistic terms (PETs) along with example texts from the GloWbE corpus. Additionally, we present a subcorpus of texts where these PETs are not being used euphemistically, which may be useful for future applications. We also discuss the results of multiple analyses run on the corpus. Firstly, we find that sentiment analysis on the euphemistic texts supports that PETs generally decrease negative and offensive sentiment. Secondly, we observe cases of disagreement in an annotation task, where humans are asked to label PETs as euphemistic or not in a subset of our corpus text examples. We attribute the disagreement to a variety of potential reasons, including if the PET was a commonly accepted term (CAT).

* Proceedings of LREC 2022 

  Access Paper or Ask Questions

Stretching Sentence-pair NLI Models to Reason over Long Documents and Clusters

Apr 15, 2022
Tal Schuster, Sihao Chen, Senaka Buthpitiya, Alex Fabrikant, Donald Metzler

Natural Language Inference (NLI) has been extensively studied by the NLP community as a framework for estimating the semantic relation between sentence pairs. While early work identified certain biases in NLI models, recent advancements in modeling and datasets demonstrated promising performance. In this work, we further explore the direct zero-shot applicability of NLI models to real applications, beyond the sentence-pair setting they were trained on. First, we analyze the robustness of these models to longer and out-of-domain inputs. Then, we develop new aggregation methods to allow operating over full documents, reaching state-of-the-art performance on the ContractNLI dataset. Interestingly, we find NLI scores to provide strong retrieval signals, leading to more relevant evidence extractions compared to common similarity-based methods. Finally, we go further and investigate whole document clusters to identify both discrepancies and consensus among sources. In a test case, we find real inconsistencies between Wikipedia pages in different languages about the same topic.

  Access Paper or Ask Questions

DL4SciVis: A State-of-the-Art Survey on Deep Learning for Scientific Visualization

Apr 13, 2022
Chaoli Wang, Jun Han

Since 2016, we have witnessed the tremendous growth of artificial intelligence+visualization (AI+VIS) research. However, existing survey papers on AI+VIS focus on visual analytics and information visualization, not scientific visualization (SciVis). In this paper, we survey related deep learning (DL) works in SciVis, specifically in the direction of DL4SciVis: designing DL solutions for solving SciVis problems. To stay focused, we primarily consider works that handle scalar and vector field data but exclude mesh data. We classify and discuss these works along six dimensions: domain setting, research task, learning type, network architecture, loss function, and evaluation metric. The paper concludes with a discussion of the remaining gaps to fill along the discussed dimensions and the grand challenges we need to tackle as a community. This state-of-the-art survey guides SciVis researchers in gaining an overview of this emerging topic and points out future directions to grow this research.

* 20 pages, 2 figures, and 12 tables. To Appear in IEEE Transactions on Visualization and Computer Graphics 

  Access Paper or Ask Questions

Long Short-Term Memory for Spatial Encoding in Multi-Agent Path Planning

Mar 21, 2022
Marc R. Schlichting, Stefan Notter, Walter Fichter

Reinforcement learning-based path planning for multi-agent systems of varying size constitutes a research topic with increasing significance as progress in domains such as urban air mobility and autonomous aerial vehicles continues. Reinforcement learning with continuous state and action spaces is used to train a policy network that accommodates desirable path planning behaviors and can be used for time-critical applications. A Long Short-Term Memory module is proposed to encode an unspecified number of states for a varying, indefinite number of agents. The described training strategies and policy architecture lead to a guidance that scales to an infinite number of agents and unlimited physical dimensions, although training takes place at a smaller scale. The guidance is implemented on a low-cost, off-the-shelf onboard computer. The feasibility of the proposed approach is validated by presenting flight test results of up to four drones, autonomously navigating collision-free in a real-world environment.

* AIAA Journal of Guidance, Control, and Dynamics, March 2022 
* For associated source code, see , For associated video of flight test, see , 17 pages, 11 figures 

  Access Paper or Ask Questions

A Survey of Neural Trojan Attacks and Defenses in Deep Learning

Feb 15, 2022
Jie Wang, Ghulam Mubashar Hassan, Naveed Akhtar

Artificial Intelligence (AI) relies heavily on deep learning - a technology that is becoming increasingly popular in real-life applications of AI, even in the safety-critical and high-risk domains. However, it is recently discovered that deep learning can be manipulated by embedding Trojans inside it. Unfortunately, pragmatic solutions to circumvent the computational requirements of deep learning, e.g. outsourcing model training or data annotation to third parties, further add to model susceptibility to the Trojan attacks. Due to the key importance of the topic in deep learning, recent literature has seen many contributions in this direction. We conduct a comprehensive review of the techniques that devise Trojan attacks for deep learning and explore their defenses. Our informative survey systematically organizes the recent literature and discusses the key concepts of the methods while assuming minimal knowledge of the domain on the readers part. It provides a comprehensible gateway to the broader community to understand the recent developments in Neural Trojans.

* 15 pages 

  Access Paper or Ask Questions

Borrowing from yourself: Faster future video segmentation with partial channel update

Feb 11, 2022
Evann Courdier, François Fleuret

Semantic segmentation is a well-addressed topic in the computer vision literature, but the design of fast and accurate video processing networks remains challenging. In addition, to run on embedded hardware, computer vision models often have to make compromises on accuracy to run at the required speed, so that a latency/accuracy trade-off is usually at the heart of these real-time systems' design. For the specific case of videos, models have the additional possibility to make use of computations made for previous frames to mitigate the accuracy loss while being real-time. In this work, we propose to tackle the task of fast future video segmentation prediction through the use of convolutional layers with time-dependent channel masking. This technique only updates a chosen subset of the feature maps at each time-step, bringing simultaneously less computation and latency, and allowing the network to leverage previously computed features. We apply this technique to several fast architectures and experimentally confirm its benefits for the future prediction subtask.

  Access Paper or Ask Questions

Why the Rich Get Richer? On the Balancedness of Random Partition Models

Jan 30, 2022
Changwoo J. Lee, Huiyan Sang

Random partition models are widely used in Bayesian methods for various clustering tasks, such as mixture models, topic models, and community detection problems. While the number of clusters induced by random partition models has been studied extensively, another important model property regarding the balancedness of cluster sizes has been largely neglected. We formulate a framework to define and theoretically study the balancedness of exchangeable random partition models, by analyzing how a model assigns probabilities to partitions with different levels of balancedness. We demonstrate that the "rich-get-richer" characteristic of many existing popular random partition models is an inevitable consequence of two common assumptions: product-form exchangeability and projectivity. We propose a principled way to compare the balancedness of random partition models, which gives a better understanding of what model works better and what doesn't for different applications. We also introduce the "rich-get-poorer" random partition models and illustrate their application to entity resolution tasks.

* 26 pages, 8 figures 

  Access Paper or Ask Questions

Semantics-Preserved Distortion for Personal Privacy Protection

Jan 04, 2022
Letian Peng, Zuchao Li, Hai Zhao

Privacy protection is an important and concerning topic in Federated Learning, especially for Natural Language Processing. In client devices, a large number of texts containing personal information are produced by users every day. As the direct application of information from users is likely to invade personal privacy, many methods have been proposed in Federated Learning to block the center model from the raw information in client devices. In this paper, we try to do this more linguistically via distorting the text while preserving the semantics. In practice, we leverage a recently proposed metric, Neighboring Distribution Divergence, to evaluate the semantic preservation during the distortion. Based on the metric, we propose two frameworks for semantics-preserved distortion, a generative one and a substitutive one. Due to the lack of privacy-related tasks in the current Natural Language Processing field, we conduct experiments on named entity recognition and constituency parsing. Results from our experiments show the plausibility and efficiency of our distortion as a method for personal privacy protection.

  Access Paper or Ask Questions