Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tsendsuren Munkhdalai

A Locally Adaptive Interpretable Regression

May 14, 2020
Lkhagvadorj Munkhdalai, Tsendsuren Munkhdalai, Keun Ho Ryu

Figure 1 for A Locally Adaptive Interpretable Regression

Figure 2 for A Locally Adaptive Interpretable Regression

Figure 3 for A Locally Adaptive Interpretable Regression

Figure 4 for A Locally Adaptive Interpretable Regression

Machine learning models with both good predictability and high interpretability are crucial for decision support systems. Linear regression is one of the most interpretable prediction models. However, the linearity in a simple linear regression worsens its predictability. In this work, we introduce a locally adaptive interpretable regression (LoAIR). In LoAIR, a metamodel parameterized by neural networks predicts percentile of a Gaussian distribution for the regression coefficients for a rapid adaptation. Our experimental results on public benchmark datasets show that our model not only achieves comparable or better predictive performance than the other state-of-the-art baselines but also discovers some interesting relationships between input and target variables such as a parabolic relationship between CO2 emissions and Gross National Product (GNP). Therefore, LoAIR is a step towards bridging the gap between econometrics, statistics, and machine learning by improving the predictive ability of linear regression without depreciating its interpretability.

Via

Access Paper or Ask Questions

Exploring and Predicting Transferability across NLP Tasks

May 02, 2020
Tu Vu, Tong Wang, Tsendsuren Munkhdalai, Alessandro Sordoni, Adam Trischler, Andrew Mattarella-Micke, Subhransu Maji, Mohit Iyyer

Figure 1 for Exploring and Predicting Transferability across NLP Tasks

Figure 2 for Exploring and Predicting Transferability across NLP Tasks

Figure 3 for Exploring and Predicting Transferability across NLP Tasks

Figure 4 for Exploring and Predicting Transferability across NLP Tasks

Recent advances in NLP demonstrate the effectiveness of training large-scale language models and transferring them to downstream tasks. Can fine-tuning these models on tasks other than language modeling further improve performance? In this paper, we conduct an extensive study of the transferability between 33 NLP tasks across three broad classes of problems (text classification, question answering, and sequence labeling). Our results show that transfer learning is more beneficial than previously thought, especially when target task data is scarce, and can improve performance even when the source task is small or differs substantially from the target task (e.g., part-of-speech tagging transfers well to the DROP QA dataset). We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task, and we validate their effectiveness in experiments controlled for source and target data size. Overall, our experiments reveal that factors such as source data size, task and domain similarity, and task complexity all play a role in determining transferability.

* Preprint, 44 pages, 3 figures, 33 tables

Via

Access Paper or Ask Questions

Metalearned Neural Memory

Jul 23, 2019
Tsendsuren Munkhdalai, Alessandro Sordoni, Tong Wang, Adam Trischler

We augment recurrent neural networks with an external memory mechanism that builds upon recent progress in metalearning. We conceptualize this memory as a rapidly adaptable function that we parameterize as a deep neural network. Reading from the neural memory function amounts to pushing an input (the key vector) through the function to produce an output (the value vector). Writing to memory means changing the function; specifically, updating the parameters of the neural network to encode desired information. We leverage training and algorithmic techniques from metalearning to update the neural memory function in one shot. The proposed memory-augmented model achieves strong performance on a variety of learning problems, from supervised question answering to reinforcement learning.

* 24 pages

Via

Access Paper or Ask Questions

Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension

Oct 12, 2018
Rajarshi Das, Tsendsuren Munkhdalai, Xingdi Yuan, Adam Trischler, Andrew McCallum

Figure 1 for Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension

Figure 2 for Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension

Figure 3 for Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension

Figure 4 for Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension

We propose a neural machine-reading model that constructs dynamic knowledge graphs from procedural text. It builds these graphs recurrently for each step of the described procedure, and uses them to track the evolving states of participant entities. We harness and extend a recently proposed machine reading comprehension (MRC) model to query for entity states, since these states are generally communicated in spans of text and MRC models perform well in extracting entity-centric spans. The explicit, structured, and evolving knowledge graph representations that our model constructs can be used in downstream question answering tasks to improve machine comprehension of text, as we demonstrate empirically. On two comprehension tasks from the recently proposed PROPARA dataset (Dalvi et al., 2018), our model achieves state-of-the-art results. We further show that our model is competitive on the RECIPES dataset (Kiddon et al., 2015), suggesting it may be generally applicable. We present some evidence that the model's knowledge graphs help it to impose commonsense constraints on its predictions.

* ICLR 2019 submission

Via

Access Paper or Ask Questions

Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study

Sep 07, 2018
John P. Lalor, Hao Wu, Tsendsuren Munkhdalai, Hong Yu

Figure 1 for Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study

Figure 2 for Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study

Figure 3 for Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study

Figure 4 for Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study

Interpreting the performance of deep learning models beyond test set accuracy is challenging. Characteristics of individual data points are often not considered during evaluation, and each data point is treated equally. We examine the impact of a test set question's difficulty to determine if there is a relationship between difficulty and performance. We model difficulty using well-studied psychometric methods on human response patterns. Experiments on Natural Language Inference (NLI) and Sentiment Analysis (SA) show that the likelihood of answering a question correctly is impacted by the question's difficulty. As DNNs are trained with more data, easy examples are learned more quickly than hard examples.

* EMNLP 2018

Via

Access Paper or Ask Questions

Metalearning with Hebbian Fast Weights

Jul 12, 2018
Tsendsuren Munkhdalai, Adam Trischler

Figure 1 for Metalearning with Hebbian Fast Weights

Figure 2 for Metalearning with Hebbian Fast Weights

Figure 3 for Metalearning with Hebbian Fast Weights

Figure 4 for Metalearning with Hebbian Fast Weights

We unify recent neural approaches to one-shot learning with older ideas of associative memory in a model for metalearning. Our model learns jointly to represent data and to bind class labels to representations in a single shot. It builds representations via slow weights, learned across tasks through SGD, while fast weights constructed by a Hebbian learning rule implement one-shot binding for each new task. On the Omniglot, Mini-ImageNet, and Penn Treebank one-shot learning benchmarks, our model achieves state-of-the-art results.

* 8 pages, 3 figures, 4 tables. arXiv admin note: text overlap with arXiv:1712.09926

Via

Access Paper or Ask Questions

Rapid Adaptation with Conditionally Shifted Neurons

Jul 03, 2018
Tsendsuren Munkhdalai, Xingdi Yuan, Soroush Mehri, Adam Trischler

Figure 1 for Rapid Adaptation with Conditionally Shifted Neurons

Figure 2 for Rapid Adaptation with Conditionally Shifted Neurons

Figure 3 for Rapid Adaptation with Conditionally Shifted Neurons

Figure 4 for Rapid Adaptation with Conditionally Shifted Neurons

We describe a mechanism by which artificial neural networks can learn rapid adaptation - the ability to adapt on the fly, with little data, to new tasks - that we call conditionally shifted neurons. We apply this mechanism in the framework of metalearning, where the aim is to replicate some of the flexibility of human learning in machines. Conditionally shifted neurons modify their activation values with task-specific shifts retrieved from a memory module, which is populated rapidly based on limited task experience. On metalearning benchmarks from the vision and language domains, models augmented with conditionally shifted neurons achieve state-of-the-art results.

* ICML 2018; Added: additional ablation and speed comparison with MetaNet

Via

Access Paper or Ask Questions

Sentence Simplification with Memory-Augmented Neural Networks

Apr 20, 2018
Tu Vu, Baotian Hu, Tsendsuren Munkhdalai, Hong Yu

Figure 1 for Sentence Simplification with Memory-Augmented Neural Networks

Figure 2 for Sentence Simplification with Memory-Augmented Neural Networks

Figure 3 for Sentence Simplification with Memory-Augmented Neural Networks

Figure 4 for Sentence Simplification with Memory-Augmented Neural Networks

Sentence simplification aims to simplify the content and structure of complex sentences, and thus make them easier to interpret for human readers, and easier to process for downstream NLP applications. Recent advances in neural machine translation have paved the way for novel approaches to the task. In this paper, we adapt an architecture with augmented memory capacities called Neural Semantic Encoders (Munkhdalai and Yu, 2017) for sentence simplification. Our experiments demonstrate the effectiveness of our approach on different simplification datasets, both in terms of automatic evaluation measures and human judgments.

* Accepted as a conference paper at NAACL HLT 2018

Via

Access Paper or Ask Questions

Meta Networks

Jun 08, 2017
Tsendsuren Munkhdalai, Hong Yu

Neural networks have been successfully applied in applications with a large amount of labeled data. However, the task of rapid generalization on new concepts with small training data while preserving performances on previously learned ones still presents a significant challenge to neural network models. In this work, we introduce a novel meta learning method, Meta Networks (MetaNet), that learns a meta-level knowledge across tasks and shifts its inductive biases via fast parameterization for rapid generalization. When evaluated on Omniglot and Mini-ImageNet benchmarks, our MetaNet models achieve a near human-level performance and outperform the baseline approaches by up to 6% accuracy. We demonstrate several appealing properties of MetaNet relating to generalization and continual learning.

* Accepted at ICML 2017 - rewrote: the main section; added: MetaNet algorithmic procedure; performed: Mini-ImageNet evaluation

Via

Access Paper or Ask Questions