"Topic": models, code, and papers

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

Dec 23, 2019
Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, Rosanne Liu

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the attribute model push the LM's hidden activations and thus guide the generation. Model samples demonstrate control over a range of topics and sentiment styles, and extensive automated and human annotated evaluations show attribute alignment and fluency. PPLMs are flexible in that any combination of differentiable attribute models may be used to steer text generation, which will allow for diverse and creative applications beyond the examples given in this paper.
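The steering step described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the "LM" is reduced to a single hidden vector `h` and a random output matrix `W`, the attribute model is a bag of words given by hypothetical token ids `bow_ids`, and the update is a plain normalised gradient-ascent step. The sketch also omits anything the full method does to keep the generated text fluent.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def bow_log_prob(h, W, bow_ids):
    """Log of the next-token probability mass that falls on the bag of words."""
    p = softmax(W @ h)
    return np.log(p[bow_ids].sum())

def bow_grad(h, W, bow_ids):
    """Analytic gradient of bow_log_prob with respect to the hidden vector h."""
    p = softmax(W @ h)
    q = np.zeros_like(p)
    q[bow_ids] = p[bow_ids] / p[bow_ids].sum()   # p restricted to the bag, renormalised
    return W.T @ (q - p)

def steer(h, W, bow_ids, step=0.05, n_steps=5):
    """Push h along the attribute gradient (toy stand-in for the backward pass)."""
    h = h.copy()
    for _ in range(n_steps):
        g = bow_grad(h, W, bow_ids)
        h += step * g / (np.linalg.norm(g) + 1e-12)   # normalised ascent step
    return h

# Toy setup: 50-token vocabulary, 16-dimensional hidden state (both assumed).
rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(50, 16))
h0 = rng.normal(size=16)
bow_ids = np.array([3, 7, 11])   # hypothetical ids of the topic words

h1 = steer(h0, W, bow_ids)
print("log p(bag) before:", bow_log_prob(h0, W, bow_ids))
print("log p(bag) after: ", bow_log_prob(h1, W, bow_ids))
```

The pretrained weights `W` are never updated here; only the activation `h` moves, which is the sense in which the attribute model is plugged in without retraining the LM.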



Plug and Play Language Models: a Simple Approach to Controlled Text Generation

Dec 04, 2019
Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, Rosanne Liu

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the attribute model push the LM's hidden activations and thus guide the generation. Model samples demonstrate control over a range of topics and sentiment styles, and extensive automated and human annotated evaluations show attribute alignment and fluency. PPLMs are flexible in that any combination of differentiable attribute models may be used to steer text generation, which will allow for diverse and creative applications beyond the examples given in this paper.



Deep Networks with Shape Priors for Nucleus Detection

Jun 29, 2018
Mohammad Tofighi, Tiantong Guo, Jairam K. P. Vanamala, Vishal Monga

Detection of cell nuclei in microscopic images is a challenging research topic, because of limitations in cellular image quality and diversity of nuclear morphology, i.e. varying nuclei shapes, sizes, and overlaps between multiple cell nuclei. This has been a topic of enduring interest with promising recent success shown by deep learning methods. These methods train, for example, convolutional neural networks (CNNs) with a training set of input images and known, labeled nuclei locations. Many of these methods are supplemented by spatial or morphological processing. We develop a new approach that we call Shape Priors with Convolutional Neural Networks (SP-CNN) to perform significantly enhanced nuclei detection. A set of canonical shapes is prepared with the help of a domain expert. Subsequently, we present a new network structure that can incorporate 'expected behavior' of nucleus shapes via two components: learnable layers that perform the nucleus detection and a fixed processing part that guides the learning with prior information. Analytically, we formulate a new regularization term that is targeted at penalizing false positives while simultaneously encouraging detection inside the cell nucleus boundary. Experimental results on a challenging dataset reveal that SP-CNN is competitive with or outperforms several state-of-the-art methods.

* Accepted paper to 2018 IEEE International Conference on Image Processing (ICIP 2018) 


Gender identity and lexical variation in social media

May 12, 2014
David Bamman, Jacob Eisenstein, Tyler Schnoebelen

We present a study of the relationship between gender, linguistic style, and social networks, using a novel corpus of 14,000 Twitter users. Prior quantitative work on gender often treats this social variable as a female/male binary; we argue for a more nuanced approach. By clustering Twitter users, we find a natural decomposition of the dataset into various styles and topical interests. Many clusters have strong gender orientations, but their use of linguistic resources sometimes directly conflicts with the population-level language statistics. We view these clusters as a more accurate reflection of the multifaceted nature of gendered language styles. Previous corpus-based work has also had little to say about individuals whose linguistic styles defy population-level gender patterns. To identify such individuals, we train a statistical classifier, and measure the classifier confidence for each individual in the dataset. Examining individuals whose language does not match the classifier's model for their gender, we find that they have social networks that include significantly fewer same-gender social connections and that, in general, social network homophily is correlated with the use of same-gender language markers. Pairing computational methods and social theory thus offers a new perspective on how gender emerges as individuals position themselves relative to audiences, topics, and mainstream gender norms.

* Journal of Sociolinguistics 18 (2014) 135-160 
* submission version 


Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else?

Nov 09, 2021
Alexander Michael Daniel

Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms. Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications. By leveraging an already-available analyst as a human-in-the-loop, however, the canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system. This paper aims to determine which of these techniques is best suited for this purpose and how each technique might best be used towards this end. Training datasets of the same size and nearly identical neural architectures (a BERT transformer as a word embedder with a single feed-forward layer thereafter) are used for each approach, which are then tested on sentiment- and stance-specific datasets to establish a baseline of how well each method can be used to do the other tasks. Four different datasets relating to COVID-19 disinformation are used to test the ability of each technique to detect disinformation on a topic that did not appear in the training data set. Quantitative and qualitative results from these tests are then used to provide insight into how best to employ these techniques in practice.

* 15 pages + references. Presented at the 26th International Command and Control Research and Technology Symposium, 18 October 2021 


Dual-Arm Adversarial Robot Learning

Oct 15, 2021
Elie Aljalbout

Robot learning is a very promising topic for the future of automation and machine intelligence. Future robots should be able to autonomously acquire skills, learn to represent their environment, and interact with it. While these topics have been explored in simulation, real-world robot learning research still seems limited. This is due to the additional challenges encountered in the real world, such as noisy sensors and actuators, safe exploration, non-stationary dynamics, autonomous environment resetting, and the cost of running experiments for long periods of time. Unless we develop scalable solutions to these problems, learning complex tasks involving hand-eye coordination and rich contacts will remain an untouched vision that is only feasible in controlled lab environments. We propose dual-arm settings as platforms for robot learning. Such settings enable safe data collection for acquiring manipulation skills as well as training perception modules in a robot-supervised manner. They also ease the process of resetting the environment. Furthermore, adversarial learning could potentially boost the generalization capability of robot learning methods by maximizing the exploration based on game-theoretic objectives while ensuring safety based on collaborative task spaces. In this paper, we will discuss the potential benefits of this setup as well as the challenges and research directions that can be pursued.

* Accepted at CoRL 2021, Blue Sky Track 


eDarkTrends: Harnessing Social Media Trends in Substance use disorders for Opioid Listings on Cryptomarket

Mar 29, 2021
Usha Lokala, Francois Lamy, Triyasha Ghosh Dastidar, Kaushik Roy, Raminta Daniulaityte, Srinivasan Parthasarathy, Amit Sheth

Opioid and substance misuse is rampant in the United States today, a phenomenon known as the opioid crisis. The relationship between substance use and mental health has been extensively studied, with one possible relationship being that substance misuse causes poor mental health. However, the lack of evidence on the relationship has resulted in opioids being largely inaccessible through legal means. This study analyzes substance misuse posts on social media alongside the opioids being sold through crypto market listings. We use the Drug Abuse Ontology, state-of-the-art deep learning, and BERT-based models to generate sentiment and emotion labels for the social media posts and so understand user perception, investigating questions such as: which synthetic opioids people are optimistic, neutral, or negative about; which drugs induce fear and sorrow; which drugs people love or are thankful for; which drugs people think negatively about; and which opioids cause little to no sentimental reaction. We also perform topic analysis associated with the generated sentiments and emotions to understand which topics correlate with people's responses to various drugs. Our findings can help shape policy by isolating opioid use cases where timely intervention may be required to prevent adverse consequences and overdose-related deaths and to keep the epidemic from worsening.

* 6 pages, ICLR AI for Public Health Workshop 2021 


Automatic Extraction of Urban Outdoor Perception from Geolocated Free-Texts

Oct 13, 2020
Frances Santos, Thiago H Silva, Antonio A F Loureiro, Leandro Villas

The automatic extraction of urban perception shared by people on location-based social networks (LBSNs) is an important multidisciplinary research goal. One reason is that it facilitates the understanding of the intrinsic characteristics of urban areas in a scalable way, helping to leverage new services. However, content shared on LBSNs is diverse, encompassing several topics, such as politics, sports, culture, religion, and urban perceptions, making the task of content extraction regarding a particular topic very challenging. Considering free-text messages shared on LBSNs, we propose an automatic and generic approach to extract people's perceptions. For that, our approach explores opinions that are spatio-temporally and semantically similar. We exemplify our approach in the context of urban outdoor areas in Chicago, New York City and London. Studying those areas, we found evidence that LBSN data brings valuable information about urban regions. To analyze and validate our outcomes, we conducted a temporal analysis to measure the results' robustness over time. We show that our approach can be helpful to better understand urban areas considering different perspectives. We also conducted a comparative analysis based on a public dataset, which contains volunteers' perceptions regarding urban areas expressed in a controlled experiment. We observe that both results yield a very similar level of agreement.

* Paper accepted - to be published 


DFEW: A Large-Scale Database for Recognizing Dynamic Facial Expressions in the Wild

Aug 13, 2020
Xingxun Jiang, Yuan Zong, Wenming Zheng, Chuangao Tang, Wanchuang Xia, Cheng Lu, Jiateng Liu

Recently, facial expression recognition (FER) in the wild has attracted considerable attention from researchers because it is key to moving FER techniques from the laboratory to real applications. In this paper, we focus on this challenging but interesting topic and make contributions from three aspects. First, we present a new large-scale 'in-the-wild' dynamic facial expression database, DFEW (Dynamic Facial Expression in the Wild), consisting of over 16,000 video clips from thousands of movies. These video clips contain various challenging interferences in practical scenarios such as extreme illumination, occlusions, and capricious pose changes. Second, we propose a novel method called the Expression-Clustered Spatiotemporal Feature Learning (EC-STFL) framework to deal with dynamic FER in the wild. Third, we conduct extensive benchmark experiments on DFEW using many spatiotemporal deep feature learning methods as well as our proposed EC-STFL. Experimental results show that DFEW is a well-designed and challenging database, and that the proposed EC-STFL can promisingly improve the performance of existing spatiotemporal deep neural networks in coping with the problem of dynamic FER in the wild. Our DFEW database is publicly available and can be freely downloaded from https://dfew-dataset.github.io/.



Embeddings-Based Clustering for Target Specific Stances: The Case of a Polarized Turkey

May 19, 2020
Ammar Rashed, Mucahid Kutlu, Kareem Darwish, Tamer Elsayed, Cansın Bayrak

On June 24, 2018, Turkey conducted a highly consequential election in which the Turkish people elected their president and parliament in the first election under a new presidential system. During the election period, the Turkish people extensively shared their political opinions on Twitter. One aspect of polarization among the electorate was support for or opposition to the reelection of Recep Tayyip Erdoğan. In this paper, we present an unsupervised method for target-specific stance detection in a polarized setting, specifically Turkish politics, achieving 90% precision in identifying user stances, while maintaining more than 80% recall. The method involves representing users in an embedding space using Google's Convolutional Neural Network (CNN) based multilingual universal sentence encoder. The representations are then projected onto a lower dimensional space in a manner that reflects similarities and are consequently clustered. We show the effectiveness of our method in properly clustering users of divergent groups across multiple targets that include political figures, different groups, and parties. We perform our analysis on a large dataset of 108M Turkish election-related tweets along with the timeline tweets of 168k Turkish users, who authored 213M tweets. Given the resultant user stances, we are able to observe correlations between topics and compute topic polarization.
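The embed, project, and cluster pipeline outlined in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not name the projection or clustering algorithm, so PCA and k-means are used here as stand-ins, and the "user embeddings" are synthetic vectors rather than sentence-encoder outputs. All function names are hypothetical.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto their top principal components (stand-in projection)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def farthest_first_init(X, k):
    """Deterministic init: start at X[0], then repeatedly add the farthest point."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers)

def kmeans(X, k=2, n_iter=50):
    """Plain k-means (stand-in clustering step); returns one label per row of X."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic stand-in for user embeddings: two well-separated groups of 20 users.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=+1.0, scale=0.3, size=(20, 32))
group_b = rng.normal(loc=-1.0, scale=0.3, size=(20, 32))
users = np.vstack([group_a, group_b])

labels = kmeans(pca_project(users, n_components=2), k=2)
print(labels)
```

On real data the cluster ids carry no meaning by themselves; mapping a cluster to a stance would still require inspecting a sample of its users.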

* arXiv admin note: text overlap with arXiv:1909.10213 
