Zachary M. Ziegler

Encoder-Agnostic Adaptation for Conditional Language Generation

Sep 11, 2019

Zachary M. Ziegler, Luke Melas-Kyriazi, Sebastian Gehrmann, Alexander M. Rush

Abstract: Large pretrained language models have changed the way researchers approach discriminative natural language understanding tasks, leading to the dominance of approaches that adapt a pretrained model for arbitrary downstream tasks. However, it is an open question how to use similar techniques for language generation. Early results in the encoder-agnostic setting have been mostly negative. In this work we explore methods for adapting a pretrained language model to arbitrary conditional input. We observe that pretrained transformer models are sensitive to large parameter changes during tuning. We therefore propose an adaptation that directly injects arbitrary conditioning into self attention, an approach we call pseudo self attention. Through experiments on four diverse conditional text generation tasks we show that this encoder-agnostic technique outperforms strong baselines, produces coherent generations, and is data efficient.
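
One way to read "injects arbitrary conditioning into self attention" is sketched below. This is an illustration under stated assumptions and not a formula quoted from the abstract: $Y$ is taken to be the decoder's hidden states, $X$ the output of an arbitrary encoder, $W_q, W_k, W_v$ the pretrained self-attention projections, and $U_k, U_v$ newly introduced projections that map the conditioning into the pretrained key and value spaces. The usual scaling factor and multi-head structure are omitted for brevity.

```latex
% Ordinary self attention over decoder states Y (pretrained weights only)
\mathrm{SA}(Y) = \mathrm{softmax}\!\left( (Y W_q)(Y W_k)^{\top} \right) (Y W_v)

% Sketch of pseudo self attention: conditioning X is prepended to the
% keys and values through the new projections U_k and U_v (assumed names)
\mathrm{PSA}(X, Y) = \mathrm{softmax}\!\left( (Y W_q)
  \begin{bmatrix} X U_k \\ Y W_k \end{bmatrix}^{\top} \right)
  \begin{bmatrix} X U_v \\ Y W_v \end{bmatrix}
```

Under this reading, only $U_k$ and $U_v$ are initialized from scratch, so the pretrained attention weights see inputs of a familiar form, which fits the abstract's observation that pretrained transformers are sensitive to large parameter changes.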

Neural Linguistic Steganography

Sep 03, 2019

Zachary M. Ziegler, Yuntian Deng, Alexander M. Rush

Abstract: Whereas traditional cryptography encrypts a secret message into an unintelligible form, steganography conceals that communication is taking place by encoding a secret message into a cover signal. Language is a particularly pragmatic cover signal due to its benign occurrence and independence from any one medium. Traditionally, linguistic steganography systems encode secret messages in existing text via synonym substitution or word order rearrangements. Advances in neural language models enable previously impractical generation-based techniques. We propose a steganography technique based on arithmetic coding with large-scale neural language models. We find that our approach can generate realistic-looking cover sentences as evaluated by humans, while at the same time preserving security by matching the cover message distribution with the language model distribution.

* EMNLP 2019 Accepted
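
The idea of arithmetic coding as a steganographic encoder can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the four-word vocabulary, the fixed distribution standing in for a neural language model, and the function names `encode` and `decode` are hypothetical and do not come from the paper. The secret bits are read as a binary fraction, and each emitted token is the one whose cumulative-probability interval contains that fraction; a receiver sharing the same model replays the intervals to recover the bits.

```python
from fractions import Fraction

# Hypothetical toy "language model": one fixed next-token distribution.
# A real system would query a neural LM conditioned on the context.
VOCAB = ["the", "cat", "sat", "down"]
PROBS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]


def next_token_intervals(context):
    """Return (token, low, high) cumulative intervals that partition [0, 1)."""
    intervals, low = [], Fraction(0)
    for token, prob in zip(VOCAB, PROBS):
        intervals.append((token, low, low + prob))
        low += prob
    return intervals


def encode(bits):
    """Map a non-empty bit string to cover tokens by interval narrowing."""
    if not bits:
        raise ValueError("bits must be non-empty")
    n = len(bits)
    message = Fraction(int(bits, 2), 2 ** n)
    # Aim at the midpoint of the message's dyadic interval so that the
    # shrinking interval eventually fits inside [message, message + 2^-n).
    target = message + Fraction(1, 2 ** (n + 1))
    low, high = Fraction(0), Fraction(1)
    tokens = []
    while not (low >= message and high <= message + Fraction(1, 2 ** n)):
        width = high - low
        for token, a, b in next_token_intervals(tokens):
            t_low, t_high = low + width * a, low + width * b
            if t_low <= target < t_high:
                tokens.append(token)
                low, high = t_low, t_high
                break
    return tokens


def decode(tokens, n):
    """Replay the interval narrowing and read off the first n bits."""
    low, high = Fraction(0), Fraction(1)
    for i, token in enumerate(tokens):
        width = high - low
        for candidate, a, b in next_token_intervals(tokens[:i]):
            if candidate == token:
                low, high = low + width * a, low + width * b
                break
    # low is non-negative, so int() is a floor here.
    return format(int(low * 2 ** n), "0{}b".format(n))


if __name__ == "__main__":
    secret = "1011"
    cover = encode(secret)
    print(cover, decode(cover, len(secret)))
```

This sketch uses exact rational arithmetic to avoid floating-point drift and assumes the receiver knows the message length. It only demonstrates the round trip; it makes no attempt to reproduce the distribution-matching security property described in the abstract.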

Latent Normalizing Flows for Discrete Sequences

Jan 29, 2019

Zachary M. Ziegler, Alexander M. Rush

Abstract: Normalizing flows have been shown to be a powerful class of generative models for continuous random variables, giving both strong performance and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.
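
For readers unfamiliar with the setup, the two standard ingredients the abstract combines can be written as follows. This is a generic sketch and not the paper's exact objective: $f$ denotes an invertible flow from latent $z$ to a base variable $\epsilon$, $p(x \mid z)$ the stochastic mapping to the discrete observation $x$, and $q(z \mid x)$ an approximate posterior; all symbol names are assumptions.

```latex
% Change of variables: density of z under an invertible flow f
% with base density p_\epsilon
\log p(z) = \log p_{\epsilon}\!\left( f(z) \right)
          + \log \left| \det \frac{\partial f(z)}{\partial z} \right|

% Evidence lower bound for discrete x with a flow-based latent prior p(z)
\log p(x) \;\ge\; \mathbb{E}_{q(z \mid x)}
  \left[ \log p(x \mid z) + \log p(z) - \log q(z \mid x) \right]
```

The first line is why flows apply naturally only to continuous variables: the Jacobian term requires a differentiable, invertible map. Placing the flow in the latent space and handling discreteness through $p(x \mid z)$ sidesteps that restriction.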
