Avishek Anand

Learnt Sparsity for Effective and Interpretable Document Ranking

Jun 23, 2021
Jurek Leonhardt, Koustav Rudra, Avishek Anand

Machine learning models for the ad-hoc retrieval of documents and passages have recently shown impressive improvements due to better language understanding using large pre-trained language models. However, these over-parameterized models are inherently non-interpretable and do not provide any information on the parts of the documents that were used to arrive at a certain prediction. In this paper we introduce the select-and-rank paradigm for document ranking, where interpretability is explicitly ensured when scoring longer documents. Specifically, we first select sentences in a document based on the input query and then predict the query-document score based only on the selected sentences, which act as an explanation. We treat sentence selection as a latent variable trained jointly with the ranker from the final output. We conduct extensive experiments to demonstrate that our inherently interpretable select-and-rank approach is competitive in comparison to other state-of-the-art methods and sometimes even outperforms them. This is due to our novel end-to-end training approach based on weighted reservoir sampling that manages to train the selector despite the stochastic sentence selection. We also show that our sentence selection approach can be used to provide explanations for models that operate on only parts of the document, such as BERT.
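
The stochastic sentence selection mentioned in the abstract can be illustrated with a small sketch of weighted reservoir sampling. This is not the authors' implementation: it uses the classic Efraimidis-Spirakis A-Res scheme as a stand-in, and the function names, the per-sentence scores, and the `rank_fn` callback are all hypothetical. It shows only the sampling step, not how gradients reach the selector.

```python
import heapq
import random


def weighted_reservoir_sample(weights, k, rng=None):
    """Draw up to k distinct indices without replacement, favouring
    larger weights (A-Res: keep the k largest keys u ** (1 / w))."""
    rng = rng or random.Random()
    heap = []  # min-heap of (key, index); the smallest key is evicted first
    for i, w in enumerate(weights):
        if w <= 0:
            continue  # items with non-positive weight are never selected
        key = rng.random() ** (1.0 / w)
        if len(heap) < k:
            heapq.heappush(heap, (key, i))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, i))
    return sorted(i for _, i in heap)


def select_and_rank(sentence_scores, k, rank_fn, rng=None):
    """Hypothetical wrapper: sample k sentences, then score the document
    using only the selected sentences."""
    chosen = weighted_reservoir_sample(sentence_scores, k, rng)
    return chosen, rank_fn(chosen)


if __name__ == "__main__":
    scores = [0.1, 0.0, 5.0, 2.0]  # assumed selector scores, one per sentence
    chosen, score = select_and_rank(scores, 2, rank_fn=len,
                                    rng=random.Random(0))
    print(chosen, score)
```

The selected indices double as the explanation: the ranker in this sketch never sees the sentences that were not sampled.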

Towards Axiomatic Explanations for Neural Ranking Models

Jun 15, 2021
Michael Völske, Alexander Bondarenko, Maik Fröbe, Matthias Hagen, Benno Stein, Jaspreet Singh, Avishek Anand

Recently, neural networks have been successfully employed to improve upon state-of-the-art performance in ad-hoc retrieval tasks via machine-learned ranking functions. While neural retrieval models grow in complexity and impact, little is understood about their correspondence with well-studied IR principles. Recent work on interpretability in machine learning has provided tools and techniques to understand neural models in general, yet there has been little progress towards explaining ranking models. We investigate whether one can explain the behavior of neural ranking models in terms of their congruence with well understood principles of document ranking by using established theories from axiomatic IR. Axiomatic analysis of information retrieval models has formalized a set of constraints on ranking decisions that reasonable retrieval models should fulfill. We operationalize this axiomatic thinking to reproduce rankings based on combinations of elementary constraints. This allows us to investigate to what extent the ranking decisions of neural rankers can be explained in terms of retrieval axioms, and which axioms apply in which situations. Our experimental study considers a comprehensive set of axioms over several representative neural rankers. While the existing axioms can already explain the particularly confident ranking decisions rather well, future work should extend the axiom set to also cover the other still "unexplainable" neural IR rank decisions.

* 11 pages, 2 figures. To be published in the proceedings of ICTIR 2021
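
As a rough illustration of explaining rank decisions through retrieval axioms, the sketch below encodes one well-known constraint from axiomatic IR, TFC1 (for documents of about equal length, prefer the one with more query term occurrences), and measures how often a given ranking agrees with it. The function names, the length tolerance, and the pairwise agreement measure are assumptions made for this example; the paper's actual axiom set and aggregation are not described in the abstract.

```python
def tfc1_preference(query_terms, doc_a, doc_b, length_tolerance=0.1):
    """Return +1 if the axiom prefers doc_a, -1 if doc_b, 0 if undecided.

    Documents are token lists. The relaxed precondition (lengths within
    `length_tolerance` of each other) is an assumption of this sketch.
    """
    len_a, len_b = len(doc_a), len(doc_b)
    longest = max(len_a, len_b)
    if longest == 0 or abs(len_a - len_b) > length_tolerance * longest:
        return 0  # precondition not met, the axiom does not apply
    tf_a = sum(doc_a.count(t) for t in query_terms)
    tf_b = sum(doc_b.count(t) for t in query_terms)
    return (tf_a > tf_b) - (tf_a < tf_b)


def axiom_agreement(query_terms, ranked_docs, axiom=tfc1_preference):
    """Fraction of document pairs on which the axiom has a preference
    and that preference matches the order in `ranked_docs`."""
    agree = decided = 0
    for i in range(len(ranked_docs)):
        for j in range(i + 1, len(ranked_docs)):
            pref = axiom(query_terms, ranked_docs[i], ranked_docs[j])
            if pref != 0:
                decided += 1
                agree += pref > 0  # doc i is ranked above doc j
    return agree / decided if decided else None


if __name__ == "__main__":
    ranking = [["cat", "cat", "dog"], ["cat", "dog", "dog"],
               ["cat", "cat", "cat"]]
    print(axiom_agreement(["cat"], ranking))
```

A low agreement value would suggest that this axiom alone does not account for the ranker's decisions on that query.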

Exploiting Sentence-Level Representations for Passage Ranking

Jun 14, 2021
Jurek Leonhardt, Fabian Beringer, Avishek Anand

Recently, pre-trained contextual models, such as BERT, have been shown to perform well in language-related tasks. We revisit the design decisions that govern the applicability of these models for the passage re-ranking task in open-domain question answering. We find that common approaches in the literature rely on fine-tuning a pre-trained BERT model and using a single, global representation of the input, discarding useful fine-grained relevance signals in token- or sentence-level representations. We argue that these discarded tokens hold useful information that can be leveraged. In this paper, we explicitly model the sentence-level representations by using Dynamic Memory Networks (DMNs) and conduct empirical evaluation to show improvements in passage re-ranking over fine-tuned vanilla BERT models by memory-enhanced explicit sentence modelling on a diverse set of open-domain QA datasets. We further show that freezing the BERT model and only training the DMN layer still comes close to the original performance, while improving training efficiency drastically. This indicates that the usual fine-tuning step mostly helps to aggregate the inherent information in a single output token, as opposed to adapting the whole model to the new task, and only achieves rather small gains.

BERTnesia: Investigating the capture and forgetting of knowledge in BERT

Jun 05, 2021
Jonas Wallat, Jaspreet Singh, Avishek Anand

Probing complex language models has recently revealed several insights into linguistic and semantic patterns found in the learned representations. In this article, we probe BERT specifically to understand and measure the relational knowledge it captures in its parametric memory. While probing for linguistic understanding is commonly applied to all layers of BERT as well as fine-tuned models, this has not been done for factual knowledge. We utilize existing knowledge base completion tasks (LAMA) to probe every layer of pre-trained as well as fine-tuned BERT models (ranking, question answering, NER). Our findings show that knowledge is not just contained in BERT's final layers. Intermediate layers contribute a significant amount (17-60%) to the total knowledge found. Probing intermediate layers also reveals how different types of knowledge emerge at varying rates. When BERT is fine-tuned, relational knowledge is forgotten. The extent of forgetting is impacted by the fine-tuning objective and the training data. We found that ranking models forget the least and retain more knowledge in their final layer compared to masked language modeling and question-answering. However, masked language modeling performed the best at acquiring new knowledge from the training data. When it comes to learning facts, we found that capacity and fact density are key factors. We hope this initial work will spur further research into understanding the parametric memory of language models and the effect of training objectives on factual knowledge. The code to repeat the experiments is publicly available on GitHub.

* arXiv admin note: substantial text overlap with arXiv:2010.09313

Zorro: Valid, Sparse, and Stable Explanations in Graph Neural Networks

May 18, 2021
Thorben Funke, Megha Khosla, Avishek Anand

With the ever-increasing popularity and applications of graph neural networks, several proposals have been made to interpret and understand the decisions of a GNN model. Explanations for a GNN model differ in principle from other input settings. It is important to attribute the decision to input features and other related instances connected by the graph structure. We find the previous explanation generation approaches, which maximize the mutual information between the label distribution produced by the GNN model and the explanation, to be restrictive. Specifically, existing approaches do not enforce explanations to be predictive, sparse, or robust to input perturbations. In this paper, we lay down some of the fundamental principles that an explanation method for GNNs should follow and introduce a metric, fidelity, as a measure of the explanation's effectiveness. We propose a novel approach, Zorro, based on principles from rate-distortion theory, that uses a simple combinatorial procedure to optimize for fidelity. Extensive experiments on real and synthetic datasets reveal that Zorro produces sparser, more stable, and more faithful explanations than existing GNN explanation approaches.

Towards Benchmarking the Utility of Explanations for Model Debugging

May 10, 2021
Maximilian Idahl, Lijun Lyu, Ujwal Gadiraju, Avishek Anand

Post-hoc explanation methods are an important class of approaches that help understand the rationale underlying a trained model's decision. But how useful are they for an end-user towards accomplishing a given task? In this vision paper, we argue the need for a benchmark to facilitate evaluations of the utility of post-hoc explanation methods. As a first step to this end, we enumerate desirable properties that such a benchmark should possess for the task of debugging text classifiers. Additionally, we highlight that such a benchmark facilitates not only assessing the effectiveness of explanations but also their efficiency.

* Short paper, to appear at TrustNLP @ NAACL 2021

An In-depth Analysis of Passage-Level Label Transfer for Contextual Document Ranking

Mar 30, 2021
Koustav Rudra, Zeon Trevor Fernando, Avishek Anand

Recently introduced pre-trained contextualized language models like BERT have shown improvements in document retrieval tasks. One of the major limitations of the current approaches can be attributed to the manner in which they deal with variable-size document lengths using a fixed-input BERT model. Common approaches either truncate or split longer documents into small sentences/passages and subsequently label them, using either the original document label or a label from another externally trained model. In this paper, we conduct a detailed study of how the design decisions about splitting and label transfer affect retrieval effectiveness and efficiency. We find that direct transfer of relevance labels from documents to passages introduces label noise that strongly affects retrieval effectiveness for large training datasets. We also find that query processing times are adversely affected by fine-grained splitting schemes. As a remedy, we propose a careful passage-level labelling scheme using weak supervision that delivers improved performance (3-14% in terms of nDCG score) over most of the recently proposed models for ad-hoc retrieval while maintaining manageable computational complexity on four diverse document retrieval datasets.

* Paper is about the performance analysis of contextual ranking strategies in ad-hoc document retrieval
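
The two labelling options contrasted in the abstract above, copying the document label to every passage versus labelling passages with the help of an external model, can be sketched as follows. This is an illustrative sketch only: the window size, stride, threshold, and the term-overlap `overlap_scorer` are hypothetical stand-ins, not the splitting scheme or weak supervision model used in the paper.

```python
def split_into_passages(tokens, size, stride):
    """Split a token list into windows of `size` tokens, moving `stride`
    tokens at a time; the last window may be shorter."""
    passages = []
    for start in range(0, len(tokens), stride):
        passages.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the document is fully covered
    return passages


def direct_transfer(passages, doc_label):
    """Every passage inherits the document's relevance label."""
    return [(p, doc_label) for p in passages]


def overlap_scorer(query_terms, passage):
    """Hypothetical external scorer: share of query terms in the passage."""
    unique = set(query_terms)
    if not unique:
        return 0.0
    return sum(t in passage for t in unique) / len(unique)


def weak_label_transfer(passages, doc_label, query_terms, scorer, threshold):
    """Keep the document label only for passages the scorer rates at or
    above `threshold`; all other passages are labelled non-relevant (0)."""
    return [(p, doc_label if scorer(query_terms, p) >= threshold else 0)
            for p in passages]


if __name__ == "__main__":
    doc = "the cat sat on the mat while dogs bark outside".split()
    passages = split_into_passages(doc, size=5, stride=5)
    print(direct_transfer(passages, 1))
    print(weak_label_transfer(passages, 1, ["cat"], overlap_scorer, 0.5))
```

In this toy case direct transfer marks both passages as relevant, while the filtered variant keeps the label only for the passage that mentions the query term, which is the kind of label noise the abstract refers to.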

Explain and Predict, and then Predict Again

Feb 04, 2021
Zijian Zhang, Koustav Rudra, Avishek Anand

A desirable property of learning systems is to be both effective and interpretable. Towards this goal, recent models, called explain-then-predict models, have been proposed that first generate an extractive explanation from the input text and then generate a prediction on just the explanation. These models primarily consider the task input as a supervision signal in learning an extractive explanation and do not effectively integrate rationale data as an additional inductive bias to improve task performance. We propose a novel yet simple approach, ExPred, that uses multi-task learning in the explanation generation phase, effectively trading off explanation and prediction losses. We then use another prediction network on just the extracted explanations to optimize the task performance. We conduct an extensive evaluation of our approach on three diverse language datasets -- fact verification, sentiment classification, and QA -- and find that we substantially outperform existing approaches.

* Accepted at WSDM 2021
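
The trade-off between explanation and prediction losses in the abstract above can be pictured as a weighted sum of two terms. The sketch below is one simple way to write such a multi-task objective, not the ExPred loss itself: the weight `lam`, the per-token binary cross-entropy against a human rationale mask, and all function names are assumptions made for illustration.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between per-token selection probabilities
    and a 0/1 rationale mask."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def cross_entropy(class_probs, gold):
    """Negative log-likelihood of the gold class."""
    return -math.log(max(class_probs[gold], 1e-12))


def multitask_loss(class_probs, gold, token_probs, rationale_mask, lam=1.0):
    """Prediction loss plus `lam` times the explanation loss (an assumed
    form of the trade-off)."""
    return (cross_entropy(class_probs, gold)
            + lam * binary_cross_entropy(token_probs, rationale_mask))


if __name__ == "__main__":
    loss = multitask_loss([0.7, 0.3], 0, [0.9, 0.2, 0.6], [1, 0, 1], lam=0.5)
    print(round(loss, 4))
```

Setting `lam` to zero reduces this to the plain task loss, while larger values push the explanation generator towards the annotated rationales.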

Dissonance Between Human and Machine Understanding

Jan 18, 2021
Zijian Zhang, Jaspreet Singh, Ujwal Gadiraju, Avishek Anand

Complex machine learning models are deployed in several critical domains including healthcare and autonomous vehicles nowadays, albeit as functional black boxes. Consequently, there has been a recent surge in interpreting decisions of such complex models in order to explain their actions to humans. Models that correspond to human interpretation of a task are more desirable in certain contexts and can help attribute liability, build trust, expose biases and in turn build better models. It is, therefore, crucial to understand how and which models conform to human understanding of tasks. In this paper, we present a large-scale crowdsourcing study that reveals and quantifies the dissonance between human and machine understanding, through the lens of an image classification task. In particular, we seek to answer the following questions: Which (well-performing) complex ML models are closer to humans in their use of features to make accurate predictions? How does task difficulty affect the feature selection capability of machines in comparison to humans? Are humans consistently better at selecting features that make image recognition more accurate? Our findings have important implications on human-machine collaboration, considering that a long term goal in the field of artificial intelligence is to make machines capable of learning and reasoning like humans.

* Proceedings of the ACM on Human-Computer Interaction, 2019, 3(CSCW): 1-23
* 23 pages, 5 figures

Explain and Predict, and then Predict again

Jan 11, 2021
Zijian Zhang, Koustav Rudra, Avishek Anand

* Accepted at WSDM 2021; the camera-ready version will be available soon
