Mirella Lapata

PixT3: Pixel-based Table To Text generation

Nov 16, 2023
Iñigo Alonso, Eneko Agirre, Mirella Lapata

Table-to-Text generation has traditionally been approached as a linear language-to-text problem. However, visually rendered tables are rich in structural information and are a concise, effective way of representing data and its relationships. With text-based approaches, this information is either lost during linearization or represented in a space-inefficient manner. This inefficiency has remained a constant challenge for text-based approaches, causing them to struggle with large tables. In this paper, we demonstrate that image representations of tables are more space-efficient than typical textual linearizations, and that multimodal approaches are competitive on Table-to-Text tasks. We present PixT3, a multimodal table-to-text model that outperforms the state of the art (SotA) on the ToTTo benchmark in a pure Table-to-Text setting while remaining competitive in controlled Table-to-Text scenarios. It also generalizes better to unseen datasets, outperforming the ToTTo SotA in all generation settings. Additionally, we introduce a new intermediate training curriculum that reinforces table structural awareness, leading to improved generation and overall faithfulness.
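For readers unfamiliar with the linearization the abstract contrasts against, here is a minimal sketch of flattening a table into tagged text; the tag names and helper function are illustrative, not the exact ToTTo format:

```python
# Hypothetical sketch of the textual linearization that text-based
# Table-to-Text models typically consume: cells are flattened row by
# row, so structure must be re-encoded with extra tokens.
def linearize_table(header, rows):
    """Flatten a table into tagged text, one <row> per table row."""
    parts = ["<table>"]
    parts.append("<row> " + " ".join(f"<cell> {h} </cell>" for h in header) + " </row>")
    for row in rows:
        parts.append("<row> " + " ".join(f"<cell> {c} </cell>" for c in row) + " </row>")
    parts.append("</table>")
    return " ".join(parts)

header = ["Year", "Champion"]
rows = [["2021", "Italy"], ["2018", "France"]]
print(linearize_table(header, rows))
```

Every cell costs several delimiter tokens on top of its content, which is the space inefficiency the paper argues a pixel rendering avoids.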

Parameter-Efficient Multilingual Summarisation: An Empirical Study

Nov 14, 2023
Chenxi Whitehouse, Fantine Huot, Jasmijn Bastings, Mostafa Dehghani, Chu-Cheng Lin, Mirella Lapata

With the increasing prevalence of Large Language Models, traditional full fine-tuning approaches face growing challenges, especially in memory-intensive tasks. This paper investigates the potential of Parameter-Efficient Fine-Tuning, focusing on Low-Rank Adaptation (LoRA), for complex and under-explored multilingual summarisation tasks. We conduct an extensive study across different data availability scenarios, including full-data, low-data, and cross-lingual transfer, leveraging models of different sizes. Our findings reveal that LoRA lags behind full fine-tuning when trained with full data; however, it excels in low-data scenarios and cross-lingual transfer. Interestingly, as models scale up, the performance gap between LoRA and full fine-tuning diminishes. Additionally, we investigate effective strategies for few-shot cross-lingual transfer, finding that continued LoRA tuning outperforms both full fine-tuning and the dynamic composition of language-specific LoRA modules.
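As a refresher on the technique the study centres on, here is a minimal numpy sketch of LoRA's low-rank update; the shapes, scaling, and zero-initialization follow common practice rather than any implementation from the paper:

```python
import numpy as np

# Minimal LoRA sketch: instead of updating a frozen weight matrix W,
# train a low-rank correction B @ A with rank r << min(d_out, d_in).
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 4, 8

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection (init 0)

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A; since B = 0 at
    # initialization, the adapted model exactly matches the original.
    return (W + (alpha / r) * B @ A) @ x

x = rng.normal(size=d_in)
assert np.allclose(lora_forward(x), W @ x)  # identity at initialization
```

The memory saving comes from training only A and B (here 512 values) instead of all of W (4096 values), which is why LoRA suits the memory-intensive settings the abstract mentions.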

Visual Storytelling with Question-Answer Plans

Oct 17, 2023
Danyang Liu, Mirella Lapata, Frank Keller

Visual storytelling aims to generate compelling narratives from image sequences. Existing models often focus on enhancing the representation of the image sequence, e.g., with external knowledge sources or advanced graph structures. Despite recent progress, the stories are often repetitive, illogical, and lacking in detail. To mitigate these issues, we present a novel framework which integrates visual representations with pretrained language models and planning. Our model translates the image sequence into a visual prefix, a sequence of continuous embeddings which language models can interpret. It also leverages a sequence of question-answer pairs as a blueprint plan for selecting salient visual concepts and determining how they should be assembled into a narrative. Automatic and human evaluation on the VIST benchmark (Huang et al., 2016) demonstrates that blueprint-based models generate stories that are more coherent, interesting, and natural compared to competitive baselines and state-of-the-art systems.

* EMNLP 2023 Findings 

Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing

Jul 09, 2023
Tom Sherborne, Tom Hosking, Mirella Lapata

Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data. Previous work has primarily considered silver-standard data augmentation or zero-shot methods; however, exploiting few-shot gold data is comparatively unexplored. We propose a new approach to cross-lingual semantic parsing that explicitly minimizes cross-lingual divergence between probabilistic latent variables using Optimal Transport. We demonstrate how this direct guidance improves parsing from natural languages using fewer examples and less training. We evaluate our method on two datasets, MTOP and MultiATIS++SQL, establishing state-of-the-art results under a few-shot cross-lingual regime. Ablation studies further reveal that our method improves performance even without parallel input translations. In addition, we show that our model better captures cross-lingual structure in the latent space, improving semantic representation similarity.
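The entropy-regularized Sinkhorn algorithm is a standard way to compute the kind of Optimal Transport plan the abstract alludes to; the following numpy sketch is generic and not the authors' implementation:

```python
import numpy as np

def sinkhorn(a, b, cost, reg=1.0, n_iters=200):
    """Entropy-regularized optimal transport (Sinkhorn-Knopp).

    a, b: source/target marginals (each summing to 1);
    cost: pairwise cost matrix. Returns a transport plan whose
    row/column sums approximate a and b.
    """
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Toy alignment between two small sets of latent vectors.
rng = np.random.default_rng(0)
src, tgt = rng.normal(size=(3, 5)), rng.normal(size=(4, 5))
cost = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)
plan = sinkhorn(np.full(3, 1 / 3), np.full(4, 1 / 4), cost)
ot_distance = (plan * cost).sum()  # transport cost between the two sets
```

Minimizing such a transport cost between latent distributions is one concrete way to realize the cross-lingual divergence objective the paper describes.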

* Accepted to TACL 2023. Pre-MIT Press publication. 17 pages, 3 figures, 6 tables 

$\mu$PLAN: Summarizing using a Content Plan as Cross-Lingual Bridge

May 23, 2023
Fantine Huot, Joshua Maynez, Chris Alberti, Reinald Kim Amplayo, Priyanka Agrawal, Constanza Fierro, Shashi Narayan, Mirella Lapata

Cross-lingual summarization consists of generating a summary in one language given an input document in a different language, allowing for the dissemination of relevant content across speakers of other languages. However, this task remains challenging, mainly because of the need for cross-lingual datasets and the compounded difficulty of summarizing and translating. This work presents $\mu$PLAN, an approach to cross-lingual summarization that uses an intermediate planning step as a cross-lingual bridge. We formulate the plan as a sequence of entities capturing the conceptualization of the summary, i.e., identifying the salient content and the order in which to present it, separate from the surface form. Using a multilingual knowledge base, we align the entities to their canonical designation across languages. $\mu$PLAN models first learn to generate the plan and then continue generating the summary conditioned on the plan and the input. We evaluate our methodology on the XWikis dataset, with cross-lingual pairs across four languages, and demonstrate that this planning objective achieves state-of-the-art performance in terms of ROUGE and faithfulness scores. Moreover, this planning approach improves zero-shot transfer to new cross-lingual language pairs compared to non-planning baselines.

* EMNLP 2023 Submission 

Attributable and Scalable Opinion Summarization

May 19, 2023
Tom Hosking, Hao Tang, Mirella Lapata

We propose a method for unsupervised opinion summarization that encodes sentences from customer reviews into a hierarchical discrete latent space, then identifies common opinions based on the frequency of their encodings. We can generate both abstractive summaries, by decoding these frequent encodings, and extractive summaries, by selecting the sentences assigned to the same frequent encodings. Our method is attributable, because the model identifies the sentences used to generate the summary as part of the summarization process. It scales easily to many hundreds of input reviews, because aggregation is performed in the latent space rather than over long sequences of tokens. We also demonstrate that our approach enables a degree of control: we can generate aspect-specific summaries by restricting the model to the parts of the encoding space that correspond to desired aspects (e.g., location or food). Automatic and human evaluation on two datasets from different domains demonstrates that our method generates summaries that are more informative than prior work and better grounded in the input reviews.
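To make the frequency-based selection concrete, here is a toy sketch in which review sentences are assumed to be already quantized into discrete code paths; the encoder itself is omitted, and all sentences and codes below are made up:

```python
from collections import Counter

# Toy sketch of frequency-based opinion selection: each sentence maps
# to a path of discrete codes (standing in for the hierarchical latent
# space); common opinions are the most frequent paths, and an
# extractive summary picks sentences assigned to them.
encoded = [
    ("the staff were friendly",   (3, 1, 7)),
    ("really friendly staff",     (3, 1, 7)),
    ("staff could not be nicer",  (3, 1, 7)),
    ("rooms were small",          (5, 0, 2)),
    ("tiny rooms",                (5, 0, 2)),
    ("breakfast was cold",        (8, 4, 4)),
]

counts = Counter(code for _, code in encoded)
top_codes = [code for code, _ in counts.most_common(2)]

# Extractive summary: one representative sentence per frequent code.
summary = [next(s for s, c in encoded if c == code) for code in top_codes]
print(summary)
```

Because each summary sentence is tied to the input sentences sharing its code, the selection step is attributable by construction, which is the property the abstract highlights.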

* ACL 2023 

Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

May 17, 2023
Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata

We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing. We are interested in this question because, if LLMs were able to improve each other, it would imply the possibility of creating strong AI agents with minimal human intervention. We ask two LLMs to negotiate with each other, playing the roles of a buyer and a seller, respectively. They aim to reach a deal, with the buyer targeting a lower price and the seller a higher one. A third language model, playing the critic, provides feedback to a player to improve that player's negotiation strategies. We let the two agents play multiple rounds, using previous negotiation history and AI feedback as in-context demonstrations to iteratively improve the model's negotiation strategy. We use different LLMs (GPT and Claude) for different roles and use the deal price as the evaluation metric. Our experiments reveal multiple intriguing findings: (1) Only a subset of the language models we consider can self-play and improve the deal price from AI feedback; weaker models either do not understand the game's rules or cannot incorporate AI feedback for further improvement. (2) Models' abilities to learn from feedback differ when playing different roles. For example, it is harder for Claude-instant to improve as the buyer than as the seller. (3) When unrolling the game to multiple rounds, stronger agents can consistently improve their performance by meaningfully using previous experiences and iterative AI feedback, yet they run a higher risk of breaking the deal. We hope our work provides an insightful initial exploration of having models autonomously improve each other with game playing and AI feedback.
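Schematically, the play-then-critique loop looks like the sketch below, with simple rule-based stand-ins for the LLM buyer, seller, and critic; every name and rule here is illustrative, not the authors' setup:

```python
# Toy self-play negotiation loop: a seller concedes from an asking
# price, a buyer counters, and a "critic" message changes the seller's
# strategy in later rounds (standing in for in-context AI feedback).
def seller_offer(price, feedback):
    # Critic feedback nudges the seller to concede more slowly.
    return price - (2 if feedback == "hold firm" else 5)

def buyer_counter(price):
    return price - 10  # buyer always pushes below the current ask

def negotiate(start=100, floor=60, feedback=None):
    price = start
    while True:
        ask = seller_offer(price, feedback)
        bid = buyer_counter(ask)
        if bid >= floor:     # deal reached once the bid clears the floor
            return bid
        price = ask

history, feedback = [], None
for round_ in range(3):       # play a round, then apply critic feedback
    history.append(negotiate(feedback=feedback))
    feedback = "hold firm"    # critic tells the seller to concede less
```

In the paper the three roles are separate LLMs and the feedback is natural-language critique placed in context; the loop structure, though, is the same: play, critique, replay, and track the deal price across rounds.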

* Preprint. Code at https://github.com/FranxYao/GPT-Bargaining 

Conversational Semantic Parsing using Dynamic Context Graphs

May 04, 2023
Parag Jain, Mirella Lapata

In this paper we consider the task of conversational semantic parsing over general-purpose knowledge graphs (KGs) with millions of entities and thousands of relation types. We are interested in developing models capable of interactively mapping user utterances into executable logical forms (e.g., SPARQL) in the context of the conversational history. Our key idea is to represent information about an utterance and its context via a subgraph which is created dynamically, i.e., the number of nodes varies per utterance. Moreover, rather than treating the subgraph as a sequence, we exploit its underlying structure and encode it with a graph neural network, which further allows us to represent a large number of (unseen) nodes. Experimental results show that modeling context dynamically is superior to static approaches, delivering performance improvements across the board (i.e., for simple and complex questions). Our results further confirm that modeling the structure of context is better at processing discourse information (i.e., at handling ellipsis and resolving coreference) and longer interactions.
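The dynamic subgraph construction can be sketched as a k-hop neighborhood expansion around the entities mentioned so far in the conversation; the toy KG and helper below are hypothetical, and the GNN encoder is omitted:

```python
from collections import deque

# Toy sketch of a dynamic context subgraph: starting from the entities
# mentioned so far, collect their k-hop KG neighborhood, so the
# subgraph grows or shrinks per utterance. The tiny KG is made up.
kg = {
    "Paris": [("capital_of", "France")],
    "France": [("member_of", "EU")],
    "EU": [("founded_in", "1993")],
}

def context_subgraph(mentioned, hops=2):
    nodes, edges = set(mentioned), []
    frontier = deque((e, 0) for e in mentioned)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:          # stop expanding past the hop limit
            continue
        for rel, nbr in kg.get(node, []):
            edges.append((node, rel, nbr))
            if nbr not in nodes:
                nodes.add(nbr)
                frontier.append((nbr, depth + 1))
    return nodes, edges

nodes, edges = context_subgraph(["Paris"])
```

Each new utterance would re-run this extraction with an updated mention set, yielding the variable-size subgraph that the paper's graph neural network then encodes.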

Text-Blueprint: An Interactive Platform for Plan-based Conditional Generation

Apr 28, 2023
Fantine Huot, Joshua Maynez, Shashi Narayan, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Anders Sandholm, Dipanjan Das, Mirella Lapata

While conditional generation models can now generate natural language well enough to create fluent text, it is still difficult to control the generation process, leading to irrelevant, repetitive, and hallucinated content. Recent work shows that planning can be a useful intermediate step to render conditional generation less opaque and more grounded. We present a web browser-based demonstration for query-focused summarization that uses a sequence of question-answer pairs as a blueprint plan for guiding text generation (i.e., what to say and in what order). We illustrate how users may interact with the generated text and associated plan visualizations, e.g., by editing and modifying the blueprint in order to improve or control the generated output. A short video demonstrating our system is available at https://goo.gle/text-blueprint-demo.

* Accepted at EACL Call for System Demonstrations 2023 