Alexander Hanbo Li

Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning

Aug 10, 2023
Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang

We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods that primarily focus on specific types of structured data. Our proposed method aims to improve performance in multi-task training, zero-shot and few-shot scenarios by providing a unified representation that can handle various forms of structured data such as tables, knowledge graph triples, and meaning representations. We demonstrate that our proposed approach can effectively adapt to new structured forms, and can improve performance in comparison to current methods. For example, our method resulted in a 66% improvement in zero-shot BLEU scores when transferring models trained on table inputs to a knowledge graph dataset. Our proposed method is an important step towards a more general data-to-text generation framework.
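As a rough illustration of what a unified representation could look like, the sketch below flattens a table, a set of knowledge graph triples, and a meaning representation into one linearized string format. The function names and the `[ROW]`/`[HEADER]`/`[CELL]` markers are assumptions made for this example, not necessarily the format used in the paper.

```python
from typing import Dict, List, Sequence, Tuple

# One row is a list of (column header, cell value) pairs.
Row = List[Tuple[str, str]]


def linearize(rows: List[Row]) -> str:
    """Flatten rows of (header, value) pairs into a single marked-up string."""
    parts = []
    for row in rows:
        cells = " ".join(f"[HEADER] {h} [CELL] {v}" for h, v in row)
        parts.append(f"[ROW] {cells}")
    return " ".join(parts)


def table_to_rows(headers: Sequence[str], records: Sequence[Sequence[str]]) -> List[Row]:
    """A table maps directly: pair each cell with its column header."""
    return [list(zip(headers, record)) for record in records]


def triples_to_rows(triples: Sequence[Tuple[str, str, str]]) -> List[Row]:
    """Hypothetical mapping: each triple becomes a two-cell row,
    with the relation acting as the header of the object cell."""
    return [[("subject", s), (r, o)] for s, r, o in triples]


def mr_to_rows(mr: Dict[str, str]) -> List[Row]:
    """A meaning representation (slot -> value) becomes a single row."""
    return [list(mr.items())]
```

Because all three converters produce the same row structure, a single `linearize` call can feed any of the input types to the same sequence-to-sequence model.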

Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge

May 30, 2023
Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang

The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge. Recently, pre-trained language models (PLMs) such as GPT-3 have been applied to the task and shown to be powerful sources of world knowledge. However, these methods suffer from two problems: low knowledge coverage caused by PLM bias (the tendency to generate certain tokens over others regardless of prompt changes), and high dependency on PLM quality (only models using GPT-3 achieve the best results). To address these challenges, we propose RASO, a new VQA pipeline that, for the first time, deploys a generate-then-select strategy guided by world knowledge. Rather than following the de facto standard of training a multi-modal model that directly generates the VQA answer, RASO first uses a PLM to generate all the possible answers, and then trains a lightweight answer selection model to pick the correct answer. As shown in our analysis, RASO expands the knowledge coverage from in-domain training data by a large margin. We provide extensive experimentation and show the effectiveness of our pipeline by advancing the state of the art by 4.1% on OK-VQA, without additional computation cost. Code and models are released at http://cogcomp.org/page/publication_view/1010

* Accepted to ACL 2023 Findings 
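The generate-then-select idea can be sketched in a few lines. This is a toy outline under stated assumptions: `generate` stands in for a PLM that proposes candidate answers and `score` for a trained selection model. Both are hypothetical callables supplied by the caller, not RASO's actual components.

```python
from typing import Callable, List


def generate_then_select(
    question: str,
    generate: Callable[[str], List[str]],
    score: Callable[[str, str], float],
) -> str:
    """Propose candidate answers, then return the highest-scoring one."""
    # Deduplicate candidates while keeping their original order.
    candidates = list(dict.fromkeys(generate(question)))
    if not candidates:
        raise ValueError("generator returned no candidate answers")
    # The selector only has to rank a small candidate set,
    # which is why it can be a lightweight model.
    return max(candidates, key=lambda answer: score(question, answer))
```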

Benchmarking Diverse-Modal Entity Linking with Generative Models

May 27, 2023
Sijia Wang, Alexander Hanbo Li, Henry Zhu, Sheng Zhang, Chung-Wei Hang, Pramuditha Perera, Jie Ma, William Wang, Zhiguo Wang, Vittorio Castelli, Bing Xiang, Patrick Ng

Entities can be expressed in diverse formats, such as text, images, or column names and cell values in tables. While existing entity linking (EL) models work well on individual modality configurations, such as text-only EL, visual grounding, or schema linking, it is more challenging to design a unified model for diverse modality configurations. To bring various modality configurations together, we constructed a benchmark for diverse-modal EL (DMEL) from existing EL datasets, covering three modalities: text, image, and table. To approach the DMEL task, we proposed a generative diverse-modal model (GDMM) following a multimodal encoder-decoder paradigm. Pre-training GDMM with rich corpora builds a solid foundation for DMEL without storing the entire KB for inference. Fine-tuning GDMM builds a stronger DMEL baseline, outperforming state-of-the-art task-specific EL models by 8.51 F1 points on average. Additionally, extensive error analyses are conducted to highlight the challenges of DMEL, facilitating future research on this task.

* 15 pages. ACL 2023 

Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness

Jan 21, 2023
Shuaichen Chang, Jun Wang, Mingwen Dong, Lin Pan, Henghui Zhu, Alexander Hanbo Li, Wuwei Lan, Sheng Zhang, Jiarong Jiang, Joseph Lilien, Steve Ash, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Bing Xiang

Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries. However, recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations. Previous curated robustness test sets usually focus on individual phenomena. In this paper, we propose a comprehensive robustness benchmark based on Spider, a cross-domain text-to-SQL benchmark, to diagnose the model robustness. We design 17 perturbations on databases, natural language questions, and SQL queries to measure the robustness from different angles. In order to collect more diversified natural question perturbations, we utilize large pretrained language models (PLMs) to simulate human behaviors in creating natural questions. We conduct a diagnostic study of the state-of-the-art models on the robustness set. Experimental results reveal that even the most robust model suffers from a 14.0% performance drop overall and a 50.7% performance drop on the most challenging perturbation. We also present a breakdown analysis regarding text-to-SQL model designs and provide insights for improving model robustness.

* ICLR 2023 

DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases

Sep 30, 2022
Donghan Yu, Sheng Zhang, Patrick Ng, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Yiqun Hu, William Wang, Zhiguo Wang, Bing Xiang

Question answering over knowledge bases (KBs) aims to answer natural language questions with factual information such as entities and relations in KBs. Previous methods either generate logical forms that can be executed over KBs to obtain final answers or predict answers directly. Empirical results show that the former often produces more accurate answers, but it suffers from non-execution issues due to potential syntactic and semantic errors in the generated logical forms. In this work, we propose a novel framework, DecAF, that jointly generates both logical forms and direct answers, and then combines their merits to get the final answers. Moreover, unlike most previous methods, DecAF is based on simple free-text retrieval without relying on any entity linking tools -- this simplification eases its adaptation to different datasets. DecAF achieves new state-of-the-art accuracy on the WebQSP, FreebaseQA, and GrailQA benchmarks, while getting competitive results on the ComplexWebQuestions benchmark.
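One simple way to combine the two output types is to prefer an executable logical form and fall back to the directly generated answers when execution fails. The sketch below shows that rule only; it is an assumption for illustration and not necessarily the combination strategy used in DecAF. The `execute` callable is a hypothetical KB executor supplied by the caller.

```python
from typing import Callable, List, Sequence


def combine_answers(
    logical_forms: Sequence[str],
    execute: Callable[[str], List[str]],
    direct_answers: List[str],
) -> List[str]:
    """Return the result of the first logical form that executes to a
    non-empty answer set; otherwise return the direct answers."""
    for lf in logical_forms:  # assumed to be ordered best-first
        try:
            result = execute(lf)
        except Exception:
            # Syntactic or semantic errors make this form non-executable.
            continue
        if result:
            return result
    return direct_answers
```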

Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding

Sep 28, 2022
Jun Wang, Patrick Ng, Alexander Hanbo Li, Jiarong Jiang, Zhiguo Wang, Ramesh Nallapati, Bing Xiang, Sudipta Sengupta

Most recent research on Text-to-SQL semantic parsing relies on either the parser itself or simple heuristic-based approaches to understand the natural language query (NLQ). When synthesizing a SQL query, no explicit semantic information about the NLQ is available to the parser, which leads to undesirable generalization performance. In addition, without lexical-level fine-grained query understanding, linking between the query and the database can only rely on fuzzy string matching, which leads to suboptimal performance in real applications. In view of this, we present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding. Our framework consists of three modules: a named entity recognizer (NER), a neural entity linker (NEL), and a neural semantic parser (NSP). By jointly modeling the query and the database, the NER model analyzes user intents and identifies entities in the query. The NEL model links typed entities to the schema and cell values in the database. The parser model leverages the available semantic information and linking results, and synthesizes tree-structured SQL queries based on a dynamically generated grammar. Experiments on SQUALL, a newly released semantic parsing dataset, show that we can achieve 56.8% execution accuracy on the WikiTableQuestions (WTQ) test set, which outperforms the state-of-the-art model by 2.7%.

* EMNLP Industry Track 2022 

Learning to Selectively Learn for Weakly-supervised Paraphrase Generation

Sep 25, 2021
Kaize Ding, Dingcheng Li, Alexander Hanbo Li, Xing Fan, Chenlei Guo, Yang Liu, Huan Liu

Paraphrase generation is a longstanding NLP task that has diverse applications for downstream NLP tasks. However, the effectiveness of existing efforts predominantly relies on large amounts of gold-labeled data. Though unsupervised approaches have been proposed to address this issue, they may fail to generate meaningful paraphrases due to the lack of supervision signals. In this work, we go beyond the existing paradigms and propose a novel approach to generate high-quality paraphrases with weak supervision data. Specifically, we tackle the weakly-supervised paraphrase generation problem by: (1) obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo paraphrase expansion; and (2) developing a meta-learning framework to progressively select valuable samples for fine-tuning a pre-trained language model, i.e., BART, on the sentential paraphrasing task. We demonstrate that our approach achieves significant improvements over existing unsupervised approaches, and is even comparable in performance with state-of-the-art supervised approaches.

* Accepted by EMNLP 2021 (long) 

Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering

Aug 05, 2021
Alexander Hanbo Li, Patrick Ng, Peng Xu, Henghui Zhu, Zhiguo Wang, Bing Xiang

The current state-of-the-art generative models for open-domain question answering (ODQA) have focused on generating direct answers from unstructured textual information. However, a large amount of the world's knowledge is stored in structured databases and needs to be accessed using query languages such as SQL. Furthermore, query languages can answer questions that require complex reasoning, and they offer full explainability. In this paper, we propose a hybrid framework that takes both textual and tabular evidence as input and generates either direct answers or SQL queries, depending on which form could better answer the question. The generated SQL queries can then be executed on the associated databases to obtain the final answers. To the best of our knowledge, this is the first paper that applies Text2SQL to ODQA tasks. Empirically, we demonstrate that on several ODQA datasets, the hybrid methods consistently outperform the baseline models that only take homogeneous input by a large margin. Specifically, we achieve state-of-the-art performance on the OpenSQuAD dataset using a T5-base model. In a detailed analysis, we demonstrate that being able to generate structural SQL queries can always bring gains, especially for questions that require complex reasoning.

* ACL 2021 
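To make the "direct answer or SQL query" output concrete, here is a minimal sketch of how a model's output string might be resolved into final answers. The `answer:` / `sql:` prefixes and the `resolve` function are assumptions for this example rather than the paper's actual output format; the SQL is run with Python's standard `sqlite3` module.

```python
import sqlite3
from typing import List


def resolve(output: str, conn: sqlite3.Connection) -> List:
    """Turn a generated string into final answers.

    Assumed format: "sql: <query>" is executed on the database,
    anything else ("answer: <text>") is returned as-is.
    """
    kind, _, body = output.partition(":")
    body = body.strip()
    if kind.strip() == "sql":
        # Return the first column of every result row.
        return [row[0] for row in conn.execute(body).fetchall()]
    return [body]
```

In practice, generated SQL should be executed against a read-only copy of the database, since the model may emit arbitrary statements.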

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

Dec 18, 2020
Peng Shi, Patrick Ng, Zhiguo Wang, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Cicero Nogueira dos Santos, Bing Xiang

Recently, there has been significant interest in learning contextual representations for various NLP tasks by leveraging large-scale text corpora to train large neural language models with self-supervised learning objectives, such as Masked Language Model (MLM). However, based on a pilot study, we observe three issues with existing general-purpose language models when they are applied to text-to-SQL semantic parsers: they fail to detect column mentions in the utterances, fail to infer column mentions from cell values, and fail to compose complex SQL queries. To mitigate these issues, we present a model pre-training framework, Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-training data. GAP MODEL is trained on 2M utterance-schema pairs and 30K utterance-schema-SQL triples, whose utterances are produced by generative models. Based on experimental results, neural semantic parsers that leverage GAP MODEL as a representation encoder obtain new state-of-the-art results on both the SPIDER and CRITERIA-TO-SQL benchmarks.

* Accepted to AAAI 2021 

Decomposed Adversarial Learned Inference

Apr 21, 2020
Alexander Hanbo Li, Yaqing Wang, Changyou Chen, Jing Gao

Effective inference for a generative adversarial model remains an important and challenging problem. We propose a novel approach, Decomposed Adversarial Learned Inference (DALI), which explicitly matches prior and conditional distributions in both data and code spaces, and puts a direct constraint on the dependency structure of the generative model. We derive an equivalent form of the prior and conditional matching objective that can be optimized efficiently without any parametric assumption on the data. We validate the effectiveness of DALI on the MNIST, CIFAR-10, and CelebA datasets by conducting quantitative and qualitative evaluations. Results demonstrate that DALI significantly improves both reconstruction and generation as compared to other adversarial inference models.
