Antonia Creswell

Solving math word problems with process- and outcome-based feedback

Nov 25, 2022

Jonathan Uesato, Nate Kushman, Ramana Kumar, Francis Song, Noah Siegel, Lisa Wang, Antonia Creswell, Geoffrey Irving, Irina Higgins

Abstract:Recent work has shown that asking language models to generate reasoning steps improves performance on many reasoning tasks. When moving beyond prompting, this raises the question of how we should supervise such models: outcome-based approaches which supervise the final result, or process-based approaches which supervise the reasoning process itself? Differences between these approaches might naturally be expected not just in final-answer errors but also in reasoning errors, which can be difficult to detect and are problematic in many real-world domains such as education. We run the first comprehensive comparison between process- and outcome-based approaches trained on a natural language task, GSM8K. We find that pure outcome-based supervision produces similar final-answer error rates with less label supervision. However, for correct reasoning steps we find it necessary to use process-based supervision or supervision from learned reward models that emulate process-based feedback. In total, we improve the previous best results from 16.8% $\to$ 12.7% final-answer error and 14.0% $\to$ 3.4% reasoning error among final-answer-correct solutions.

Faithful Reasoning Using Large Language Models

Aug 30, 2022

Antonia Creswell, Murray Shanahan

Abstract:Although contemporary large language models (LMs) demonstrate impressive question-answering capabilities, their answers are typically the product of a single call to the model. This entails an unwelcome degree of opacity and compromises performance, especially on problems that are inherently multi-step. To address these limitations, we show how LMs can be made to perform faithful multi-step reasoning via a process whose causal structure mirrors the underlying logical structure of the problem. Our approach works by chaining together reasoning steps, where each step results from calls to two fine-tuned LMs, one for selection and one for inference, to produce a valid reasoning trace. Our method carries out a beam search through the space of reasoning traces to improve reasoning quality. We demonstrate the effectiveness of our model on multi-step logical deduction and scientific question-answering, showing that it outperforms baselines on final answer accuracy, and generates humanly interpretable reasoning traces whose validity can be checked by the user.

Language models show human-like content effects on reasoning

Jul 14, 2022

Ishita Dasgupta, Andrew K. Lampinen, Stephanie C. Y. Chan, Antonia Creswell, Dharshan Kumaran, James L. McClelland, Felix Hill

Abstract:Abstract reasoning is a key ability for an intelligent system. Large language models achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human abstract reasoning is also imperfect, and depends on our knowledge and beliefs about the content of the reasoning problem. For example, humans reason much more reliably about logical rules that are grounded in everyday situations than arbitrary rules about abstract attributes. The training experiences of language models similarly endow them with prior expectations that reflect human knowledge and beliefs. We therefore hypothesized that language models would show human-like content effects on abstract reasoning problems. We explored this hypothesis across three logical reasoning tasks: natural language inference, judging the logical validity of syllogisms, and the Wason selection task (Wason, 1968). We find that state-of-the-art large language models (with 7 or 70 billion parameters; Hoffmann et al., 2022) reflect many of the same patterns observed in humans across these tasks -- like humans, models reason more effectively about believable situations than unrealistic or abstract ones. Our findings have implications for understanding both these cognitive effects, and the factors that contribute to language model performance.

Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning

May 19, 2022

Antonia Creswell, Murray Shanahan, Irina Higgins

Abstract:Large language models (LLMs) have been shown to be capable of impressive few-shot generalisation to new tasks. However, they still tend to perform poorly on multi-step logical reasoning problems. Here we carry out a comprehensive evaluation of LLMs on 50 tasks that probe different aspects of logical reasoning. We show that language models tend to perform fairly well at single step inference or entailment tasks, but struggle to chain together multiple reasoning steps to solve more complex problems. In light of this, we propose a Selection-Inference (SI) framework that exploits pre-trained LLMs as general processing modules, and alternates between selection and inference to generate a series of interpretable, causal reasoning steps leading to the final answer. We show that a 7B parameter LLM used within the SI framework in a 5-shot generalisation setting, with no fine-tuning, yields a performance improvement of over 100% compared to an equivalent vanilla baseline on a suite of 10 logical reasoning tasks. The same model in the same setting even outperforms a significantly larger 280B parameter baseline on the same suite of tasks. Moreover, answers produced by the SI framework are accompanied by a causal natural-language-based reasoning trace, which has important implications for the safety and trustworthiness of the system.
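
The alternation between selection and inference described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: in the paper both modules are pre-trained LLMs, whereas here they are replaced by hypothetical rule-matching stand-ins (`select`, `infer`), and the names `Rule`, `selection_inference`, and `max_steps` are assumptions made only for illustration. The point is the control flow: each step selects a small piece of context, infers one new fact from that selection alone, and records the step in a trace.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: a rule is a (premise, conclusion) pair of strings.
Rule = Tuple[str, str]
Step = Tuple[str, Rule, str]  # (selected fact, selected rule, inferred fact)


def select(facts: List[str], rules: List[Rule]) -> Optional[Tuple[str, Rule]]:
    """Toy stand-in for the selection module: pick a known fact and a rule it triggers."""
    for premise, conclusion in rules:
        if premise in facts and conclusion not in facts:
            return premise, (premise, conclusion)
    return None  # nothing new can be derived


def infer(fact: str, rule: Rule) -> str:
    """Toy stand-in for the inference module: it sees only the selection, not the full context."""
    premise, conclusion = rule
    if fact != premise:
        raise ValueError("selected fact does not match the rule's premise")
    return conclusion


def selection_inference(
    facts: List[str], rules: List[Rule], goal: str, max_steps: int = 5
) -> Tuple[bool, List[Step]]:
    """Alternate selection and inference until the goal is derived or no step is possible."""
    known = list(facts)
    trace: List[Step] = []
    for _ in range(max_steps):
        if goal in known:
            break
        selection = select(known, rules)
        if selection is None:
            break
        fact, rule = selection
        new_fact = infer(fact, rule)
        trace.append((fact, rule, new_fact))
        known.append(new_fact)  # the inferred fact becomes available to later steps
    return goal in known, trace


if __name__ == "__main__":
    rules = [
        ("socrates is a man", "socrates is mortal"),
        ("socrates is mortal", "socrates will die"),
    ]
    solved, trace = selection_inference(["socrates is a man"], rules, "socrates will die")
    print(solved)
    for step in trace:
        print(step)
```

Because the inference stand-in receives only the selected fact and rule, every entry in the trace depends solely on what was selected at that step, which is the property the abstract refers to as a causal reasoning trace.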

Can language models learn from explanations in context?

Apr 05, 2022

Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill

Abstract:Large language models can perform new tasks by adapting to a few in-context examples. For humans, rapid learning from examples can benefit from explanations that connect examples to task principles. We therefore investigate whether explanations of few-shot examples can allow language models to adapt more effectively. We annotate a set of 40 challenging tasks from BIG-Bench with explanations of answers to a small subset of questions, as well as a variety of matched control explanations. We evaluate the effects of various zero-shot and few-shot prompts that include different types of explanations, instructions, and controls on the performance of a range of large language models. We analyze these results using statistical multilevel modeling techniques that account for the nested dependencies among conditions, tasks, prompts, and models. We find that explanations of examples can improve performance. Adding untuned explanations to a few-shot prompt offers a modest improvement in performance; about 1/3 the effect size of adding few-shot examples, but twice the effect size of task instructions. We then show that explanations tuned for performance on a small validation set offer substantially larger benefits; building a prompt by selecting examples and explanations together substantially improves performance over selecting examples alone. Hand-tuning explanations can substantially improve performance on challenging tasks. Furthermore, even untuned explanations outperform carefully matched controls, suggesting that the benefits are due to the link between an example and its explanation, rather than lower-level features of the language used. However, only large models can benefit from explanations. In summary, explanations can support the in-context learning abilities of large language models on

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Dec 08, 2021

Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young(+70 more)

Abstract:Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gopher. These models are evaluated on 152 diverse tasks, achieving state-of-the-art performance across the majority. Gains from scale are largest in areas such as reading comprehension, fact-checking, and the identification of toxic language, but logical and mathematical reasoning see less benefit. We provide a holistic analysis of the training dataset and model's behaviour, covering the intersection of model scale with bias and toxicity. Finally we discuss the application of language models to AI safety and the mitigation of downstream harms.

* 118 pages

SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition

Jun 07, 2021

Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matthew Botvinick, Alexander Lerchner, Christopher P. Burgess

Abstract:To help agents reason about scenes in terms of their building blocks, we wish to extract the compositional structure of any given scene (in particular, the configuration and characteristics of objects comprising the scene). This problem is especially difficult when scene structure needs to be inferred while also estimating the agent's location/viewpoint, as the two variables jointly give rise to the agent's observations. We present an unsupervised variational approach to this problem. Leveraging the shared structure that exists across different scenes, our model learns to infer two sets of latent representations from RGB video input alone: a set of "object" latents, corresponding to the time-invariant, object-level contents of the scene, as well as a set of "frame" latents, corresponding to global time-varying elements such as viewpoint. This factorization of latents allows our model, SIMONe, to represent object attributes in an allocentric manner which does not depend on viewpoint. Moreover, it allows us to disentangle object dynamics and summarize their trajectories as time-abstracted, view-invariant, per-object properties. We demonstrate these capabilities, as well as the model's performance in terms of view synthesis and instance segmentation, across three procedurally generated video datasets.

* Animated figures are available at https://sites.google.com/view/simone-scene-understanding/

Unsupervised Object-Based Transition Models for 3D Partially Observable Environments

Mar 08, 2021

Antonia Creswell, Rishabh Kabra, Chris Burgess, Murray Shanahan

Abstract:We present a slot-wise, object-based transition model that decomposes a scene into objects, aligns them (with respect to a slot-wise object memory) to maintain a consistent order across time, and predicts how those objects evolve over successive frames. The model is trained end-to-end without supervision using losses at the level of the object-structured representation rather than pixels. Thanks to its alignment module, the model deals properly with two issues that are not handled satisfactorily by other transition models, namely object persistence and object identity. We show that the combination of an object-level loss and correct object alignment over time enables the model to outperform a state-of-the-art baseline, and allows it to deal well with object occlusion and re-appearance in partially observable environments.

AlignNet: Unsupervised Entity Alignment

Jul 21, 2020

Antonia Creswell, Kyriacos Nikiforou, Oriol Vinyals, Andre Saraiva, Rishabh Kabra, Loic Matthey, Chris Burgess, Malcolm Reynolds, Richard Tanburn, Marta Garnelo(+1 more)

Abstract:Recently developed deep learning models are able to learn to segment scenes into component objects without supervision. This opens many new and exciting avenues of research, allowing agents to take objects (or entities) as inputs, rather than pixels. Unfortunately, while these models provide excellent segmentation of a single frame, they do not keep track of how objects segmented at one time-step correspond (or align) to those at a later time-step. The alignment (or correspondence) problem has impeded progress towards using object representations in downstream tasks. In this paper we take steps towards solving the alignment problem, presenting the AlignNet, an unsupervised alignment module.

An Explicitly Relational Neural Network Architecture

May 24, 2019

Murray Shanahan, Kyriacos Nikiforou, Antonia Creswell, Christos Kaplanis, David Barrett, Marta Garnelo

Abstract:With a view to bridging the gap between deep learning and symbolic AI, we present a novel end-to-end neural network architecture that learns to form propositional representations with an explicitly relational structure from raw pixel data. In order to evaluate and analyse the architecture, we introduce a family of simple visual relational reasoning tasks of varying complexity. We show that the proposed architecture, when pre-trained on a curriculum of such tasks, learns to generate reusable representations that better facilitate subsequent learning on previously unseen tasks when compared to a number of baseline architectures. The workings of a successfully trained model are visualised to shed some light on how the architecture functions.
