Gholamreza Haffari

Monash University

Normalizing Flow-based Neural Process for Few-Shot Knowledge Graph Completion

Apr 17, 2023

Linhao Luo, Yuan-Fang Li, Gholamreza Haffari, Shirui Pan

Abstract:Knowledge graphs (KGs), as a structured form of knowledge representation, have been widely applied in the real world. Recently, few-shot knowledge graph completion (FKGC), which aims to predict missing facts for unseen relations with few-shot associated facts, has attracted increasing attention from practitioners and researchers. However, existing FKGC methods are based on metric learning or meta-learning, which often suffer from the out-of-distribution and overfitting problems. Meanwhile, they are incompetent at estimating uncertainties in predictions, which is critically important as model predictions could be very unreliable in few-shot settings. Furthermore, most of them cannot handle complex relations and ignore path information in KGs, which largely limits their performance. In this paper, we propose a normalizing flow-based neural process for few-shot knowledge graph completion (NP-FKGC). Specifically, we unify normalizing flows and neural processes to model a complex distribution of KG completion functions. This offers a novel way to predict facts for few-shot relations while estimating the uncertainty. Then, we propose a stochastic ManifoldE decoder to incorporate the neural process and handle complex relations in few-shot settings. To further improve performance, we introduce an attentive relation path-based graph neural network to capture path information in KGs. Extensive experiments on three public datasets demonstrate that our method significantly outperforms the existing FKGC methods and achieves state-of-the-art performance. Code is available at https://github.com/RManLuo/NP-FKGC.git.

* SIGIR 2023
* Accepted by SIGIR2023

Koala: An Index for Quantifying Overlaps with Pre-training Corpora

Mar 26, 2023

Thuy-Trang Vu, Xuanli He, Gholamreza Haffari, Ehsan Shareghi

Abstract:In recent years, more attention has been placed on probing the role of pre-training data in the downstream behaviour of Large Language Models (LLMs). Despite its importance, there is no public tool that supports such analysis of pre-training corpora at large scale. To help research in this space, we launch Koala, a searchable index over large pre-training corpora using compressed suffix arrays with a highly efficient compression rate and search support. In its first release we index the public portion of the OPT 175B pre-training data. Koala provides a framework to do forensic analysis on current and future benchmarks as well as to assess the degree of memorization in the output of LLMs. Koala is available for public use at https://koala-index.erc.monash.edu/.

* Available here: https://koala-index.erc.monash.edu/
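
The Koala abstract above describes a suffix-array index used to measure overlap with a corpus. The sketch below is a minimal, uncompressed illustration of that idea and not Koala's implementation: the function names (`build_suffix_array`, `count_occurrences`, `overlap_ratio`), whitespace tokenization, and the n-gram overlap measure are all assumptions made for the example.

```python
def build_suffix_array(tokens):
    """Return suffix start positions, sorted by the suffix each one begins.

    Naive construction for illustration only; real indexes use compressed
    suffix arrays and far more efficient builders.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def count_occurrences(tokens, suffix_array, query):
    """Count how often the token list `query` occurs in `tokens`."""
    n = len(query)

    def bound(strict):
        # Binary search over suffixes, comparing only their first n tokens.
        lo, hi = 0, len(suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            start = suffix_array[mid]
            prefix = tokens[start:start + n]
            if prefix < query or (strict and prefix == query):
                lo = mid + 1
            else:
                hi = mid
        return lo

    # Upper bound minus lower bound is the number of matching suffixes.
    return bound(True) - bound(False)


def overlap_ratio(corpus_tokens, suffix_array, text_tokens, n):
    """Fraction of n-grams in `text_tokens` that appear in the corpus (hypothetical measure)."""
    ngrams = [text_tokens[i:i + n] for i in range(len(text_tokens) - n + 1)]
    if not ngrams:
        return 0.0
    hits = sum(
        1 for g in ngrams
        if count_occurrences(corpus_tokens, suffix_array, g) > 0
    )
    return hits / len(ngrams)
```

Once the array is built, each lookup needs only a logarithmic number of prefix comparisons, which is why suffix arrays suit repeated n-gram queries against a fixed corpus.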

ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning

Mar 22, 2023

Islam Nassar, Munawar Hayat, Ehsan Abbasnejad, Hamid Rezatofighi, Gholamreza Haffari

Abstract:Confidence-based pseudo-labeling is among the dominant approaches in semi-supervised learning (SSL). It relies on including high-confidence predictions made on unlabeled data as additional targets to train the model. We propose ProtoCon, a novel SSL method aimed at the less-explored label-scarce SSL where such methods usually underperform. ProtoCon refines the pseudo-labels by leveraging their nearest neighbours' information. The neighbours are identified as the training proceeds using an online clustering approach operating in an embedding space trained via a prototypical loss to encourage well-formed clusters. The online nature of ProtoCon allows it to utilise the label history of the entire dataset in one training cycle to refine labels in the following cycle without the need to store image embeddings. Hence, it can seamlessly scale to larger datasets at a low cost. Finally, ProtoCon addresses the poor training signal in the initial phase of training (due to fewer confident predictions) by introducing an auxiliary self-supervised loss. It delivers significant gains and faster convergence over state-of-the-art across 5 datasets, including CIFARs, ImageNet and DomainNet.

* Accepted in CVPR2023 (highlight)
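
The ProtoCon abstract above describes refining each pseudo-label with information from its neighbours in a cluster. The sketch below illustrates one simple way to do that and is not the authors' method: the function name `refine_pseudo_labels`, the mixing weight `alpha`, and the use of a plain cluster-wide average are assumptions, and the cluster assignments are taken as given rather than learned online.

```python
def refine_pseudo_labels(probs, cluster_ids, alpha=0.8):
    """Blend each sample's class probabilities with its cluster's mean.

    probs: list of per-sample class-probability lists.
    cluster_ids: cluster id for each sample (assumed to be given).
    alpha: weight kept on the sample's own prediction (hypothetical value).
    """
    sums, counts = {}, {}
    for p, c in zip(probs, cluster_ids):
        acc = sums.setdefault(c, [0.0] * len(p))
        for k, value in enumerate(p):
            acc[k] += value
        counts[c] = counts.get(c, 0) + 1

    refined = []
    for p, c in zip(probs, cluster_ids):
        cluster_mean = [s / counts[c] for s in sums[c]]
        refined.append(
            [alpha * pk + (1.0 - alpha) * mk for pk, mk in zip(p, cluster_mean)]
        )
    return refined
```

With a small `alpha`, a sample whose own prediction disagrees with its cluster is pulled toward the cluster consensus, which is the intuition behind neighbour-based refinement.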

Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery

Mar 02, 2023

Tao Feng, Lizhen Qu, Gholamreza Haffari

Abstract:In this paper, we conduct the first study on spurious correlations for open-domain response generation models based on a corpus, CGDIALOG, curated in our work. The current models indeed suffer from spurious correlations and tend to generate irrelevant and generic responses. Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation models using a conditional independence classifier. The classifier is trained by a constrained self-training method, coined CONSTRAIN, to overcome data scarcity. The experimental results based on both human and automatic evaluation show that our method significantly outperforms the competitive baselines in terms of relevance, informativeness, and fluency.

Document Flattening: Beyond Concatenating Context for Document-Level Neural Machine Translation

Feb 16, 2023

Minghao Wu, George Foster, Lizhen Qu, Gholamreza Haffari

Abstract:Existing work in document-level neural machine translation commonly concatenates several consecutive sentences as a pseudo-document, and then learns inter-sentential dependencies. This strategy limits the model's ability to leverage information from distant context. We overcome this limitation with a novel Document Flattening (DocFlat) technique that integrates Flat-Batch Attention (FBA) and Neural Context Gate (NCG) into the Transformer model to utilize information beyond the pseudo-document boundaries. FBA allows the model to attend to all the positions in the batch and learns the relationships between positions explicitly, while NCG identifies the useful information from the distant context. We conduct comprehensive experiments and analyses on three benchmark datasets for English-German translation, and validate the effectiveness of two variants of DocFlat. Empirical results show that our approach outperforms strong baselines with statistical significance on BLEU, COMET and accuracy on the contrastive test set. The analyses highlight that DocFlat is highly effective in capturing long-range information.

* 15 pages, 8 figures, accepted by EACL 2023

On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex

Feb 06, 2023

Terry Yue Zhuo, Zhuang Li, Yujin Huang, Fatemeh Shiri, Weiqing Wang, Gholamreza Haffari, Yuan-Fang Li

Abstract:Semantic parsing is a technique aimed at constructing a structured representation of the meaning of a natural-language question. Recent advancements in few-shot language models trained on code have demonstrated superior performance in generating these representations compared to traditional unimodal language models, which are trained on downstream tasks. Despite these advancements, existing fine-tuned neural semantic parsers are susceptible to adversarial attacks on natural-language inputs. While it has been established that the robustness of smaller semantic parsers can be enhanced through adversarial training, this approach is not feasible for large language models in real-world scenarios, as it requires both substantial computational resources and expensive human annotation on in-domain semantic parsing data. This paper presents the first empirical study on the adversarial robustness of a large prompt-based language model of code, Codex. Our results demonstrate that the state-of-the-art (SOTA) code-language models are vulnerable to carefully crafted adversarial examples. To address this challenge, we propose methods for improving robustness without the need for significant amounts of labeled data or heavy computational resources.

* Accepted at EACL2023 (main)

Active Learning for Multilingual Semantic Parser

Feb 04, 2023

Zhuang Li, Gholamreza Haffari

Abstract:Current multilingual semantic parsing (MSP) datasets are almost all collected by translating the utterances in the existing datasets from the resource-rich language to the target language. However, manual translation is costly. To reduce the translation effort, this paper proposes the first active learning procedure for MSP (AL-MSP). AL-MSP selects only a subset from the existing datasets to be translated. We also propose a novel selection method that prioritizes the examples diversifying the logical form structures with more lexical choices, and a novel hyperparameter tuning method that needs no extra annotation cost. Our experiments show that AL-MSP significantly reduces translation costs with ideal selection methods. Our selection method with proper hyperparameters yields better parsing performance than the other baselines on two multilingual datasets.

* EACL 2023 (findings)

Let's Negotiate! A Survey of Negotiation Dialogue Systems

Dec 18, 2022

Haolan Zhan, Yufei Wang, Tao Feng, Yuncheng Hua, Suraj Sharma, Zhuang Li, Lizhen Qu, Gholamreza Haffari

Abstract:Negotiation is one of the crucial abilities in human communication, and there has recently been a resurgence of research interest in negotiation dialogue systems, whose goal is to empower intelligent agents with the ability to efficiently help humans resolve conflicts or reach beneficial agreements. Although there have been many explorations in negotiation dialogue systems, a systematic review of this task has to date remained notably absent. To this end, we aim to fill this gap by reviewing contemporary studies in the emerging field of negotiation dialogue systems, covering benchmarks, evaluations, and methodologies. Furthermore, we also discuss potential future directions, including multi-modal, multi-party, and cross-cultural negotiation scenarios. Our goal is to provide the community with a systematic overview of negotiation dialogue systems and to inspire future research.

* An early version, work in progress

Learning Object-Language Alignments for Open-Vocabulary Object Detection

Nov 27, 2022

Chuang Lin, Peize Sun, Yi Jiang, Ping Luo, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan, Jianfei Cai

Abstract:Existing object detection methods are bound to a fixed-set vocabulary by costly labeled data. When dealing with novel categories, the model has to be retrained with more bounding box annotations. Natural language supervision is an attractive alternative for its annotation-free attributes and broader object concepts. However, learning open-vocabulary object detection from language is challenging since image-text pairs do not contain fine-grained object-language alignments. Previous solutions rely on either expensive grounding annotations or distilling classification-oriented vision models. In this paper, we propose a novel open-vocabulary object detection framework directly learning from image-text pair data. We formulate object-language alignment as a set matching problem between a set of image region features and a set of word embeddings. It enables us to train an open-vocabulary object detector on image-text pairs in a much simpler and more effective way. Extensive experiments on two benchmark datasets, COCO and LVIS, demonstrate our superior performance over the competing approaches on novel categories, e.g. achieving 32.0% mAP on COCO and 21.7% mask mAP on LVIS. Code is available at: https://github.com/clin1223/VLDet.

* Technical Report
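
The abstract above frames object-language alignment as set matching between region features and word embeddings. The sketch below illustrates that formulation on toy vectors and is not the VLDet implementation: the names `cosine` and `match_words_to_regions`, the cosine similarity, and the brute-force search are assumptions. The brute-force search is only workable for very small sets; an assignment solver such as the Hungarian algorithm would be the usual choice at scale.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_words_to_regions(word_embs, region_feats):
    """Assign each word to a distinct region so total similarity is maximal.

    Assumes len(word_embs) <= len(region_feats). Tries every one-to-one
    assignment, so it is only suitable for tiny illustrative inputs.
    """
    sim = [[cosine(w, r) for r in region_feats] for w in word_embs]
    best_score, best_assignment = -math.inf, None
    for perm in permutations(range(len(region_feats)), len(word_embs)):
        score = sum(sim[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_assignment = score, list(perm)
    return best_assignment, best_score
```

Here `best_assignment[i]` is the index of the region matched to word `i`; regions left unmatched simply receive no word.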

Complex Reading Comprehension Through Question Decomposition

Nov 07, 2022

Xiao-Yu Guo, Yuan-Fang Li, Gholamreza Haffari

Abstract:Multi-hop reading comprehension requires not only the ability to reason over raw text but also the ability to combine multiple pieces of evidence. We propose a novel learning approach that helps language models better understand difficult multi-hop questions and perform "complex, compositional" reasoning. Our model first learns to decompose each multi-hop question into several sub-questions by a trainable question decomposer. Instead of answering these sub-questions, we directly concatenate them with the original question and context, and leverage a reading comprehension model to predict the answer in a sequence-to-sequence manner. By using the same language model for these two components, our best separate/unified t5-base variants outperform the baseline by 7.2/6.1 absolute F1 points on a hard subset of the DROP dataset.

* 10 pages, 1 figure, accepted at ALTA 2022
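
The abstract above describes concatenating the generated sub-questions with the original question and context before passing them to the reader. The sketch below shows one way such an input string could be assembled. The function name `build_reader_input`, the ordering of the parts, and the `" </s> "` separator are assumptions made for illustration, not details taken from the paper.

```python
def build_reader_input(question, sub_questions, context, sep=" </s> "):
    """Join the question, its sub-questions, and the context into one string.

    The separator and the ordering are hypothetical; empty parts are skipped.
    """
    parts = [question, *sub_questions, context]
    return sep.join(p.strip() for p in parts if p.strip())
```

For example, `build_reader_input("Who scored more?", ["How many did A score?", "How many did B score?"], "A scored 3. B scored 5.")` yields a single sequence that a sequence-to-sequence reader could consume directly.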
