
Chen Qu


Exploring Dual Encoder Architectures for Question Answering

Apr 14, 2022
Zhe Dong, Jianmo Ni, Dan Bikel, Enrique Alfonseca, Yuan Wang, Chen Qu, Imed Zitouni

Figures 1–4 for Exploring Dual Encoder Architectures for Question Answering

Dual encoders have been used for question-answering (QA) and information retrieval (IR) tasks with good results. There are two major types of dual encoders: Siamese Dual Encoders (SDE), with parameters shared across the two encoders, and Asymmetric Dual Encoders (ADE), with two distinctly parameterized encoders. In this work, we explore dual encoder architectures for QA retrieval tasks. Evaluating on MS MARCO and the MultiReQA benchmark, we show that SDEs perform significantly better than ADEs. We further propose three improved versions of ADEs. Based on the evaluation of QA retrieval tasks and direct analysis of the embeddings, we demonstrate that sharing parameters in the projection layers enables ADEs to perform competitively with SDEs.
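The SDE/ADE distinction, and the shared-projection variant, can be sketched with toy numpy "encoders". The single-layer encoders and layer sizes below are illustrative stand-ins, not the paper's actual transformer towers:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM_IN, DIM_H, DIM_OUT = 8, 16, 4  # toy sizes, not from the paper

def make_encoder():
    """A toy 'encoder': one hidden layer standing in for a transformer tower."""
    return {"W": rng.normal(size=(DIM_IN, DIM_H))}

def make_projection():
    return rng.normal(size=(DIM_H, DIM_OUT))

def encode(x, enc, proj):
    h = np.tanh(x @ enc["W"])     # encoder body
    z = h @ proj                  # projection into the shared embedding space
    return z / np.linalg.norm(z)  # unit-normalize for dot-product scoring

# Siamese Dual Encoder (SDE): one set of parameters serves both towers.
shared_enc, shared_proj = make_encoder(), make_projection()

# Asymmetric Dual Encoder (ADE): two distinct encoders. The improved
# variant shares only the projection layer between the two towers.
q_enc, p_enc = make_encoder(), make_encoder()
ade_shared_proj = make_projection()

question = rng.normal(size=DIM_IN)
passage = rng.normal(size=DIM_IN)

sde_score = encode(question, shared_enc, shared_proj) @ encode(passage, shared_enc, shared_proj)
ade_score = encode(question, q_enc, ade_shared_proj) @ encode(passage, p_enc, ade_shared_proj)
print(f"SDE score: {sde_score:.3f}, ADE (shared projection) score: {ade_score:.3f}")
```

Either way, retrieval reduces to a dot product between the two towers' outputs; the architectural question is only which parameters the towers share.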


Large Dual Encoders Are Generalizable Retrievers

Dec 15, 2021
Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernández Ábrego, Ji Ma, Vincent Y. Zhao, Yi Luan, Keith B. Hall, Ming-Wei Chang, Yinfei Yang

Figures 1–4 for Large Dual Encoders Are Generalizable Retrievers

It has been shown that dual encoders trained on one domain often fail to generalize to other domains for retrieval tasks. One widespread belief is that the bottleneck layer of a dual encoder, where the final score is simply a dot product between a query vector and a passage vector, is too limited for dual encoders to be an effective retrieval model for out-of-domain generalization. In this paper, we challenge this belief by scaling up the size of the dual encoder model while keeping the bottleneck embedding size fixed. With multi-stage training, scaling up the model size surprisingly brings significant improvement on a variety of retrieval tasks, especially for out-of-domain generalization. Experimental results show that our dual encoders, Generalizable T5-based dense Retrievers (GTR), significantly outperform existing sparse and dense retrievers on the BEIR benchmark (Thakur et al., 2021). Most surprisingly, our ablation study finds that GTR is very data-efficient: it needs only 10% of the MS MARCO supervised data to achieve the best out-of-domain performance. All GTR models are released at https://tfhub.dev/google/collections/gtr/1.
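The key design choice can be sketched in a few lines of numpy: encoder capacity grows while the bottleneck stays fixed, so the scoring interface never changes. The widths and the single-hidden-layer encoders here are toy stand-ins, not GTR's actual T5 configuration:

```python
import numpy as np

rng = np.random.default_rng(1)
EMB = 4  # fixed bottleneck embedding size (toy value, not the paper's)

def make_retriever(hidden_dim):
    """Toy encoder of a given width projecting to the fixed bottleneck."""
    W1 = rng.normal(size=(8, hidden_dim)) / np.sqrt(8)
    W2 = rng.normal(size=(hidden_dim, EMB)) / np.sqrt(hidden_dim)
    def encode(x):
        z = np.tanh(x @ W1) @ W2   # wider model, same output dimensionality
        return z / np.linalg.norm(z)
    return encode

small, large = make_retriever(16), make_retriever(256)
q, p = rng.normal(size=8), rng.normal(size=8)

# Regardless of encoder capacity, the final score is the same cheap
# dot product between fixed-size query and passage vectors.
for name, enc in [("small", small), ("large", large)]:
    print(name, enc(q).shape, round(float(enc(q) @ enc(p)), 3))
```

Because the passage index stores only the fixed-size vectors, scaling the encoder leaves index size and query-time scoring cost unchanged.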


Passage Retrieval for Outside-Knowledge Visual Question Answering

May 09, 2021
Chen Qu, Hamed Zamani, Liu Yang, W. Bruce Croft, Erik Learned-Miller

Figures 1–4 for Passage Retrieval for Outside-Knowledge Visual Question Answering

In this work, we address multi-modal information needs that contain text questions and images by focusing on passage retrieval for outside-knowledge visual question answering. This task requires access to outside knowledge, which we define here as a large unstructured passage collection. We first conduct sparse retrieval with BM25 and study expanding the question with object names and image captions. We verify that visual cues play an important role and that captions tend to be more informative than object names in sparse retrieval. We then construct a dual-encoder dense retriever, with LXMERT, a multi-modal pre-trained transformer, as the query encoder. We further show that dense retrieval significantly outperforms sparse retrieval with object expansion. Moreover, dense retrieval matches the performance of sparse retrieval that leverages human-generated captions.
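The sparse-retrieval baseline with caption expansion can be sketched as follows. The minimal BM25 implementation, the example caption, and the toy documents are all illustrative, not the paper's data or code:

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Minimal BM25 over pre-tokenized documents (illustrative sketch)."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    df = Counter(t for d in docs_tokens for t in set(d))  # document frequencies
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

# The text question alone carries no visual information, so we expand it
# with a (hypothetical) generated caption before sparse retrieval.
question = "what breed is this dog"
caption = "a golden retriever catching a frisbee in a park"
expanded = (question + " " + caption).split()

docs = [
    "golden retrievers are a scottish breed of gun dog".split(),
    "the frisbee was invented in the 1940s".split(),
]
scores = bm25_scores(expanded, docs)
print(scores)
```

With the caption expansion, the passage about the dog breed outscores the distractor that only shares the incidental word "frisbee".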

* Accepted to SIGIR'21 as a short paper 

Privacy-Adaptive BERT for Natural Language Understanding

Apr 15, 2021
Chen Qu, Weize Kong, Liu Yang, Mingyang Zhang, Michael Bendersky, Marc Najork

Figures 1–4 for Privacy-Adaptive BERT for Natural Language Understanding

When applying recent advances in Natural Language Understanding (NLU) to real-world applications, privacy preservation poses a crucial challenge that has, unfortunately, not been well resolved. To address this issue, we study how to improve the effectiveness of NLU models under a Local Privacy setting, using BERT, a widely used pretrained Language Model (LM), as an example. We systematically study the strengths and weaknesses of imposing dχ-privacy, a relaxed variant of Local Differential Privacy, at different stages of language modeling: input text, token embeddings, and sequence representations. We then focus on the former two with privacy-constrained fine-tuning experiments that reveal the utility of BERT under local privacy constraints. More importantly, to the best of our knowledge, we are the first to propose privacy-adaptive LM pretraining methods, and we demonstrate that they can significantly improve model performance on privatized text input. We also interpret the level of privacy preservation and provide guidance on selecting privacy parameters.
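Imposing dχ-privacy on token embeddings amounts to adding calibrated noise before the model sees them. Below is a sketch of the multivariate mechanism commonly used for dχ-privacy on embeddings (a uniform direction scaled by a Gamma-distributed magnitude); this is an assumption about the general technique, not necessarily the paper's exact mechanism or parameters:

```python
import numpy as np

def dchi_perturb(emb, epsilon, rng):
    """Perturb an embedding vector with dχ-privacy noise.

    Direction: uniform on the unit sphere. Magnitude: Gamma(d, 1/epsilon),
    so smaller epsilon (stronger privacy) means larger expected noise.
    """
    d = emb.shape[0]
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    magnitude = rng.gamma(shape=d, scale=1.0 / epsilon)
    return emb + magnitude * direction

rng = np.random.default_rng(0)
token_emb = rng.normal(size=32)  # toy token embedding
noise_norms = []
for eps in (1.0, 10.0, 100.0):   # larger epsilon => weaker privacy, less noise
    noisy = dchi_perturb(token_emb, eps, rng)
    noise_norms.append(float(np.linalg.norm(noisy - token_emb)))
print(noise_norms)
```

The privacy parameter epsilon directly trades utility for privacy, which is why the paper's privacy-adaptive pretraining, exposing the LM to privatized inputs during pretraining, helps downstream performance.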


Weakly-Supervised Open-Retrieval Conversational Question Answering

Mar 03, 2021
Chen Qu, Liu Yang, Cen Chen, W. Bruce Croft, Kalpesh Krishna, Mohit Iyyer

Figures 1–4 for Weakly-Supervised Open-Retrieval Conversational Question Answering

Recent studies on Question Answering (QA) and Conversational QA (ConvQA) emphasize the role of retrieval: a system first retrieves evidence from a large collection and then extracts answers. This open-retrieval ConvQA setting typically assumes that each question is answerable by a single span of text within a particular passage (a span answer). The supervision signal is thus derived from whether the system can recover an exact match of this ground-truth answer span from the retrieved passages, a method referred to as span-match weak supervision. However, information-seeking conversations are challenging for this span-match method, since long answers, especially freeform answers, are not necessarily strict spans of any passage. We therefore introduce a learned weak supervision approach that can identify a paraphrased span of the known answer in a passage. Our experiments on the QuAC and CoQA datasets show that the span-match weak supervisor can only handle conversations with span answers and performs less satisfactorily on freeform answers generated by people. Our method is more flexible, as it can handle both span answers and freeform answers. Moreover, our method becomes more powerful when combined with the span-match method, showing that the two are complementary. We also conduct in-depth analyses to provide further insight into open-retrieval ConvQA under a weak supervision setting.
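The span-match baseline that the paper improves on is easy to state in code, and the example makes its limitation concrete. This is an illustrative sketch (case-insensitive substring match; the actual implementation may normalize differently):

```python
def span_match_label(passage, gold_answer):
    """Span-match weak supervision: a retrieved passage counts as a positive
    only if it contains the ground-truth answer span verbatim."""
    return gold_answer.lower() in passage.lower()

passage = "Mount Everest, at 8,849 m, is Earth's highest mountain above sea level."

# A span answer lifted from the passage is found...
print(span_match_label(passage, "Earth's highest mountain"))
# ...but a freeform, paraphrased answer fails the exact-span test,
# even though the passage clearly supports it.
print(span_match_label(passage, "It is the tallest peak on the planet"))
```

The learned weak supervisor replaces this boolean test with a model that scores whether a passage span paraphrases the known answer, so both cases above can yield a positive signal.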

* Accepted to ECIR'21 

Open-Retrieval Conversational Question Answering

May 22, 2020
Chen Qu, Liu Yang, Cen Chen, Minghui Qiu, W. Bruce Croft, Mohit Iyyer

Figures 1–4 for Open-Retrieval Conversational Question Answering

Conversational search is one of the ultimate goals of information retrieval. Recent research approaches conversational search through simplified settings of response ranking and conversational question answering, where an answer is either selected from a given candidate set or extracted from a given passage. These simplifications neglect the fundamental role of retrieval in conversational search. To address this limitation, we introduce an open-retrieval conversational question answering (ORConvQA) setting, where we learn to retrieve evidence from a large collection before extracting answers, as a further step towards building functional conversational search systems. We create a dataset, OR-QuAC, to facilitate research on ORConvQA. We build an end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader that are all based on Transformers. Our extensive experiments on OR-QuAC demonstrate that a learnable retriever is crucial for ORConvQA. We further show that our system improves substantially when history modeling is enabled in all system components. Moreover, we show that the reranker component contributes to model performance by providing a regularization effect. Finally, we perform further in-depth analyses to provide new insights into ORConvQA.
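The retrieve–rerank–read pipeline can be sketched end to end with trivial word-overlap stand-ins in place of the paper's Transformer components; everything below (the scoring functions, the toy collection, the dialog) is illustrative only:

```python
def retrieve(question, history, collection, k=3):
    """Retriever stand-in: score passages by word overlap with the
    history-augmented question (history modeling happens here)."""
    query = set((" ".join(history) + " " + question).lower().split())
    scored = sorted(collection, key=lambda p: -len(query & set(p.lower().split())))
    return scored[:k]

def rerank(question, passages):
    """Reranker stand-in: reorder the short list by overlap with the
    current question only."""
    q = set(question.lower().split())
    return sorted(passages, key=lambda p: -len(q & set(p.lower().split())))

def read(question, passage):
    """Reader stand-in: return the passage sentence sharing the most
    words with the question."""
    q = set(question.lower().split())
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    return max(sentences, key=lambda s: len(q & set(s.lower().split())))

collection = [
    "QuAC is a dataset for question answering in context. It contains dialogs.",
    "The reranker reorders retrieved passages. It provides a regularization effect.",
    "Transformers rely on self-attention. They power modern NLP systems.",
]
history = ["what is QuAC"]
question = "what does it contain"
top = rerank(question, retrieve(question, history, collection))
answer = read(question, top[0])
print(answer)
```

Note that the follow-up question is only resolvable because the retriever sees the conversation history, which mirrors the paper's finding that history modeling matters in every component.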

* Accepted to SIGIR'20 

IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems

Feb 03, 2020
Liu Yang, Minghui Qiu, Chen Qu, Cen Chen, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Haiqing Chen

Figures 1–4 for IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems

Personal assistant systems, such as Apple Siri, Google Assistant, Amazon Alexa, and Microsoft Cortana, are becoming ever more widely used. Understanding user intent, such as clarification questions, potential answers, and user feedback, in information-seeking conversations is critical for retrieving good responses. In this paper, we analyze user intent patterns in information-seeking conversations and propose an intent-aware neural response ranking model, IART, short for "Intent-Aware Ranking with Transformers". IART integrates user intent modeling with language representation learning based on the Transformer architecture, which relies entirely on a self-attention mechanism instead of recurrent nets. It incorporates intent-aware utterance attention to derive an importance weighting scheme over utterances in the conversation context, with the aim of better understanding conversation history. We conduct extensive experiments on three information-seeking conversation datasets, including both standard benchmarks and commercial data. Our proposed model outperforms all baseline methods on a variety of metrics. We also perform case studies and analysis of learned user intent and its impact on response ranking in information-seeking conversations to interpret the results.
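The general shape of intent-aware utterance attention can be sketched as a bilinear score between each utterance representation and its intent vector, normalized over the context. This is a sketch of the idea only; the dimensions, the bilinear form, and the random inputs below are assumptions, not IART's exact formulation:

```python
import numpy as np

def intent_aware_weights(utterance_embs, intent_vecs, W):
    """Weight each context utterance by a bilinear utterance-intent score,
    softmax-normalized over the conversation context."""
    scores = np.array([u @ W @ i for u, i in zip(utterance_embs, intent_vecs)])
    exp = np.exp(scores - scores.max())  # stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
n_utts, d_u, d_i = 4, 8, 3                      # toy sizes
utterance_embs = rng.normal(size=(n_utts, d_u)) # context utterance representations
intent_vecs = rng.normal(size=(n_utts, d_i))    # per-utterance intent distributions
W = rng.normal(size=(d_u, d_i))                 # learned bilinear interaction

weights = intent_aware_weights(utterance_embs, intent_vecs, W)
print(weights)
```

The resulting weights would then scale each utterance's contribution when matching the conversation context against a candidate response.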

* Accepted by WWW2020 

A Hybrid Retrieval-Generation Neural Conversation Model

Apr 19, 2019
Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, Jingjing Liu

Figures 1–4 for A Hybrid Retrieval-Generation Neural Conversation Model

Intelligent personal assistant systems, with either text-based or voice-based conversational interfaces, are becoming increasingly popular. Most previous research has used either retrieval-based or generation-based methods. Retrieval-based methods have the advantage of returning fluent and informative responses with great diversity, and the retrieved responses are easier to control and explain; however, response retrieval performance is limited by the size of the response repository. Generation-based methods, on the other hand, can return highly coherent responses given the conversation context, but they are likely to return universal or generic responses that lack grounding in knowledge. In this paper, we build a hybrid neural conversation model capable of both response retrieval and generation, combining the merits of the two types of methods. Experimental results on Twitter and Foursquare data show that the proposed model outperforms both retrieval-based and generation-based methods (including a recently proposed knowledge-grounded neural conversation model) under both automatic evaluation metrics and human evaluation. Our models and findings provide new insights into how to integrate text retrieval and text generation models for building conversation systems.

* 11 pages 

Answer Interaction in Non-factoid Question Answering Systems

Jan 15, 2019
Chen Qu, Liu Yang, Bruce Croft, Falk Scholer, Yongfeng Zhang

Figures 1–4 for Answer Interaction in Non-factoid Question Answering Systems

Information retrieval systems are evolving from document retrieval to answer retrieval. Web search logs provide large amounts of data about how people interact with ranked lists of documents, but very little is known about interaction with answer texts. In this paper, we use Amazon Mechanical Turk to investigate three answer presentation and interaction approaches in a non-factoid question answering setting. We find that people perceive and react to good and bad answers very differently, and can identify good answers relatively quickly. Our results provide the basis for further investigation of effective answer interaction and feedback methods.

* Accepted to CHIIR 2019 

User Intent Prediction in Information-seeking Conversations

Jan 11, 2019
Chen Qu, Liu Yang, Bruce Croft, Yongfeng Zhang, Johanne R. Trippas, Minghui Qiu

Figures 1–4 for User Intent Prediction in Information-seeking Conversations

Conversational assistants are being progressively adopted by the general population. However, they are not capable of handling complicated information-seeking tasks that involve multiple turns of information exchange. Due to the limited communication bandwidth in conversational search, it is important for conversational assistants to accurately detect and predict user intent in information-seeking conversations. In this paper, we investigate two aspects of user intent prediction in an information-seeking setting. First, we extract features based on the content, structural, and sentiment characteristics of a given utterance, and use classic machine learning methods to perform user intent prediction. We then conduct an in-depth feature importance analysis to identify key features in this prediction task. We find that structural features contribute most to the prediction performance. Given this finding, we construct neural classifiers to incorporate context information and achieve better performance without feature engineering. Our findings can provide insights into the important factors and effective methods of user intent prediction in information-seeking conversations.
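The feature-based approach described above can be sketched end to end: extract content, structural, and sentiment features per utterance, then fit a simple classifier. The specific features, the toy conversation turns, the intent labels (OQ/PA/FD), and the nearest-centroid classifier below are illustrative stand-ins for the paper's actual feature set and classic ML models:

```python
import numpy as np

def utterance_features(utterance, turn_index, is_starter):
    """Content, structural, and (crude) sentiment features of an utterance."""
    return np.array([
        len(utterance.split()),                        # length (content)
        float(utterance.count("?")),                   # question marks (content)
        float(turn_index),                             # position in conversation (structural)
        1.0 if is_starter else 0.0,                    # from the thread starter? (structural)
        1.0 if "thank" in utterance.lower() else 0.0,  # gratitude cue (sentiment)
    ])

# Toy labeled turns: OQ = original question, PA = potential answer, FD = feedback
X = np.stack([
    utterance_features("How do I reset my router?", 0, True),
    utterance_features("Why does my laptop overheat when charging?", 0, True),
    utterance_features("Hold the reset button for ten seconds.", 1, False),
    utterance_features("Try cleaning the fan and updating the BIOS.", 1, False),
    utterance_features("Thank you, that worked!", 2, True),
])
y = ["OQ", "OQ", "PA", "PA", "FD"]

# Nearest-centroid classifier as a stand-in for the paper's classic ML models.
centroids = {c: X[[i for i, lab in enumerate(y) if lab == c]].mean(axis=0)
             for c in set(y)}

def predict(feats):
    return min(centroids, key=lambda c: np.linalg.norm(feats - centroids[c]))

pred = predict(utterance_features("Thanks, that fixed it!", 2, True))
print(pred)
```

Note that the structural features (turn position, starter flag) do most of the work in this toy example, echoing the paper's finding that structural features contribute most to prediction performance.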

* Accepted to CHIIR 2019 