We formulate a stochastic process, FiLex, as a mathematical model of lexicon entropy in deep learning-based emergent language systems. Defining a model mathematically allows it to generate clear predictions which can be directly and decisively tested. We empirically verify across four different environments that FiLex predicts the correct correlation between hyperparameters (training steps, lexicon size, learning rate, rollout buffer size, and Gumbel-Softmax temperature) and the emergent language's entropy in 20 out of 20 environment-hyperparameter combinations. Furthermore, our experiments reveal that different environments show diverse relationships between their hyperparameters and entropy which demonstrates the need for a model which can make well-defined predictions at a precise level of granularity.
Emergent language is unique among fields within the discipline of machine learning for its open-endedness, not obviously presenting well-defined problems to be solved. As a result, the current research in the field has largely been exploratory: focusing on establishing new problems, techniques, and phenomena. Yet after these problems have been established, subsequent progress requires research which can measurably demonstrate how it improves on prior approaches. This type of research is what we call systematic research; in this paper, we illustrate this mode of research specifically for emergent language. We first identify the overarching goals of emergent language research, categorizing them as either science or engineering. Using this distinction, we present core methodological elements of science and engineering, analyze their role in current emergent language research, and recommend how to apply these elements.
We introduce FiLex, a self-reinforcing stochastic process which models finite lexicons in emergent language experiments. The central property of FiLex is that it is a self-reinforcing process, parallel to the intuition that the more a word is used in a language, the more its use will continue. As a theoretical model, FiLex serves as a way to both explain and predict the behavior of the emergent language system. We empirically test FiLex's ability to capture the relationship between the emergent language's hyperparameters and the lexicon's Shannon entropy.
Recent work in natural language processing (NLP) has focused on ethical challenges such as understanding and mitigating bias in data and algorithms; identifying objectionable content like hate speech, stereotypes and offensive language; and building frameworks for better system design and data handling practices. However, there has been little discussion about the ethical foundations that underlie these efforts. In this work, we study one ethical theory, namely deontological ethics, from the perspective of NLP. In particular, we focus on the generalization principle and the respect for autonomy through informed consent. We provide four case studies to demonstrate how these principles can be used with NLP systems. We also recommend directions to avoid the ethical issues in these systems.
An implicit association test is a human psychological test used to measure subconscious associations. While widely recognized by psychologists as an effective tool in measuring attitudes and biases, the validity of the results can be compromised if a subject does not follow the instructions or attempts to manipulate the outcome. Compared to previous work, we collect training data using a more generalized methodology. We train a variety of different classifiers to identify a participant's first attempt versus a second possibly compromised attempt. To compromise the second attempt, participants are shown their score and are instructed to change it using one of five randomly selected deception methods. Compared to previous work, our methodology demonstrates a more robust and practical framework for accurately identifying a wide variety of deception techniques applicable to the IAT.
Recurrent neural networks (RNNs), specifically long-short term memory networks (LSTMs), can model natural language effectively. This research investigates the ability for these same LSTMs to perform next "word" prediction on the Java programming language. Java source code from four different repositories undergoes a transformation that preserves the logical structure of the source code and removes the code's various specificities such as variable names and literal values. Such datasets and an additional English language corpus are used to train and test standard LSTMs' ability to predict the next element in a sequence. Results suggest that LSTMs can effectively model Java code achieving perplexities under 22 and accuracies above 0.47, which is an improvement over LSTM's performance on the English language which demonstrated a perplexity of 85 and an accuracy of 0.27. This research can have applicability in other areas such as syntactic template suggestion and automated bug patching.
We present Flipper, a natural language interface for describing high-level task specifications for robots that are compiled into robot actions. Flipper starts with a formal core language for task planning that allows expressing rich temporal specifications and uses a semantic parser to provide a natural language interface. Flipper provides immediate visual feedback by executing an automatically constructed plan of the task in a graphical user interface. This allows the user to resolve potentially ambiguous interpretations. Flipper extends itself via naturalization: its users can add definitions for utterances, from which Flipper induces new rules and adds them to the core language, gradually growing a more and more natural task specification language. Flipper improves the naturalization by generalizing the definition provided by users. Unlike other task-specification systems, Flipper enables natural language interactions while maintaining the expressive power and formal precision of a programming language. We show through an initial user study that natural language interactions and generalization can considerably ease the description of tasks. Moreover, over time, users employ more and more concepts outside of the initial core language. Such extensions are available to the Flipper community, and users can use concepts that others have defined.