Frank F. Xu

Minimally Supervised Categorization of Text with Metadata

Jun 02, 2020
Yu Zhang, Yu Meng, Jiaxin Huang, Frank F. Xu, Xuan Wang, Jiawei Han

Document categorization, which aims to assign a topic label to each document, plays a fundamental role in a wide variety of applications. Despite the success of existing studies in conventional supervised document classification, they are less concerned with two real problems: (1) the presence of metadata: in many domains, text is accompanied by various additional information such as authors and tags. Such metadata serve as compelling topic indicators and should be incorporated into the categorization framework; (2) label scarcity: labeled training samples are expensive to obtain in some cases, where categorization needs to be performed using only a small set of annotated data. In recognition of these two challenges, we propose MetaCat, a minimally supervised framework to categorize text with metadata. Specifically, we develop a generative process describing the relationships between words, documents, labels, and metadata. Guided by the generative model, we embed text and metadata into the same semantic space to encode heterogeneous signals. Then, based on the same generative process, we synthesize training samples to address the bottleneck of label scarcity. We conduct a thorough evaluation on a wide range of datasets. Experimental results demonstrate the effectiveness of MetaCat over many competitive baselines.

* 10 pages; Accepted to SIGIR 2020

A Benchmark for Structured Procedural Knowledge Extraction from Cooking Videos

May 02, 2020
Frank F. Xu, Lei Ji, Botian Shi, Junyi Du, Graham Neubig, Yonatan Bisk, Nan Duan

Procedural knowledge, which we define as concrete information about the sequence of actions that go into performing a particular procedure, plays an important role in understanding real-world tasks and actions. Humans often learn this knowledge from instructional text and video, and in this paper we aim to perform automatic extraction of this knowledge in a similar way. As a concrete step in this direction, we propose the new task of inferring procedures in a structured form (a data structure containing verbs and arguments) from multimodal instructional video contents and their corresponding transcripts. We first create a manually annotated, large evaluation dataset including over 350 instructional cooking videos along with over 15,000 English sentences in transcripts spanning over 89 recipes. We analyze the challenges posed by this task and dataset through experiments with baselines based on unsupervised segmentation, semantic role labeling, and visual action detection. The dataset and code will be publicly available at https://github.com/frankxu2004/cooking-procedural-extraction.

Incorporating External Knowledge through Pre-training for Natural Language to Code Generation

Apr 20, 2020
Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig

Open-domain code generation aims to generate code in a general-purpose programming language (such as Python) from natural language (NL) intents. Motivated by the intuition that developers usually retrieve resources on the web when writing code, we explore the effectiveness of incorporating two varieties of external knowledge into NL-to-code generation: automatically mined NL-code pairs from the online programming QA forum StackOverflow and programming language API documentation. Our evaluations show that combining the two sources with data augmentation and retrieval-based data re-sampling improves the current state-of-the-art by up to 2.2% absolute BLEU score on the code generation testbed CoNaLa. The code and resources are available at https://github.com/neulab/external-knowledge-codegen.

* Accepted by ACL 2020

How Can We Know What Language Models Know?

Nov 28, 2019
Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig

Recent work has presented intriguing results examining the knowledge contained in language models (LM) by having the LM fill in the blanks of prompts such as "Obama is a _ by profession". These prompts are usually manually created, and quite possibly sub-optimal; another prompt such as "Obama worked as a _" may result in more accurately predicting the correct profession. Because of this, given an inappropriate prompt, we might fail to retrieve facts that the LM does know, and thus any given prompt only provides a lower bound estimate of the knowledge contained in an LM. In this paper, we attempt to more accurately estimate the knowledge contained in LMs by automatically discovering better prompts to use in this querying process. Specifically, we propose mining-based and paraphrasing-based methods to automatically generate high-quality and diverse prompts and ensemble methods to combine answers from different prompts. Extensive experiments on the LAMA benchmark for extracting relational knowledge from LMs demonstrate that our methods can improve accuracy from 31.1% to 38.1%, providing a tighter lower bound on what LMs know. We have released the code and the resulting LM Prompt And Query Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
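The idea of combining answers from several prompts can be illustrated with a minimal sketch. The abstract does not specify the ensemble method, so the weighted averaging of per-prompt answer probabilities below is an assumption, and the function name `ensemble_prompts` and the probability values are hypothetical, not taken from LPAQA.

```python
from collections import defaultdict


def ensemble_prompts(prompt_predictions, weights=None):
    """Combine answer probabilities from several prompts by weighted averaging.

    prompt_predictions: list of dicts mapping candidate answer -> probability,
    one dict per prompt. weights: optional per-prompt weights (assumed to sum
    to 1); defaults to a uniform average.
    """
    n = len(prompt_predictions)
    if weights is None:
        weights = [1.0 / n] * n
    combined = defaultdict(float)
    for weight, predictions in zip(weights, prompt_predictions):
        for answer, prob in predictions.items():
            combined[answer] += weight * prob
    best = max(combined, key=combined.get)
    return best, dict(combined)


# Hypothetical LM outputs for two prompts about the same fact.
preds = [
    {"politician": 0.30, "lawyer": 0.40, "actor": 0.30},  # "... is a _ by profession"
    {"politician": 0.60, "lawyer": 0.25, "actor": 0.15},  # "... worked as a _"
]
best, scores = ensemble_prompts(preds)
print(best, scores)
```

In this toy case the first prompt alone would pick a different answer than the ensemble, which is the effect the abstract describes: a single prompt can understate what the LM knows.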

HiGitClass: Keyword-Driven Hierarchical Classification of GitHub Repositories

Oct 16, 2019
Yu Zhang, Frank F. Xu, Sha Li, Yu Meng, Xuan Wang, Qi Li, Jiawei Han

GitHub has become an important platform for code sharing and scientific exchange. With the massive number of repositories available, there is a pressing need for topic-based search. Even though the topic label functionality has been introduced, the majority of GitHub repositories do not have any labels, impeding the utility of search and topic-based analysis. This work targets the automatic repository classification problem as keyword-driven hierarchical classification. Specifically, users only need to provide a label hierarchy with keywords as supervision. This setting is flexible, adapts to the users' needs, accounts for the different granularity of topic labels, and requires minimal human effort. We identify three key challenges of this problem, namely (1) the presence of multi-modal signals; (2) supervision scarcity and bias; (3) supervision format mismatch. In recognition of these challenges, we propose the HiGitClass framework, comprising three modules: heterogeneous information network embedding; keyword enrichment; topic modeling and pseudo document generation. Experimental results on two GitHub repository collections confirm that HiGitClass is superior to existing weakly-supervised and dataless hierarchical classification methods, especially in its ability to integrate both structured and unstructured data for repository classification.

* 10 pages; Accepted to ICDM 2019

StateLens: A Reverse Engineering Solution for Making Existing Dynamic Touchscreens Accessible

Aug 20, 2019
Anhong Guo, Junhan Kong, Michael Rivera, Frank F. Xu, Jeffrey P. Bigham

Blind people frequently encounter inaccessible dynamic touchscreens in their everyday lives that are difficult, frustrating, and often impossible to use independently. Touchscreens are often the only way to control everything from coffee machines and payment terminals, to subway ticket machines and in-flight entertainment systems. Interacting with dynamic touchscreens is difficult non-visually because the visual user interfaces change, interactions often occur over multiple different screens, and it is easy to accidentally trigger interface actions while exploring the screen. To solve these problems, we introduce StateLens - a three-part reverse engineering solution that makes existing dynamic touchscreens accessible. First, StateLens reverse engineers the underlying state diagrams of existing interfaces using point-of-view videos found online or taken by users using a hybrid crowd-computer vision pipeline. Second, using the state diagrams, StateLens automatically generates conversational agents to guide blind users through specifying the tasks that the interface can perform, allowing the StateLens iOS application to provide interactive guidance and feedback so that blind users can access the interface. Finally, a set of 3D-printed accessories enable blind people to explore capacitive touchscreens without the risk of triggering accidental touches on the interface. Our technical evaluation shows that StateLens can accurately reconstruct interfaces from stationary, hand-held, and web videos; and, a user study of the complete system demonstrates that StateLens successfully enables blind users to access otherwise inaccessible dynamic touchscreens.

* ACM UIST 2019

Incorporating Diversity into Influential Node Mining

Oct 29, 2018
Yu Zhang, Frank F. Xu, Tianshu Lyu, Xiang Ren, Jiawei Han

Diversity is a crucial criterion in many ranking and mining tasks. In this paper, we study how to incorporate node diversity into influence maximization (IM). We consider diversity as a reverse measure of the average similarity between selected nodes, which can be specified using node embedding or community detection results. Our goal is to identify a set of nodes which are simultaneously influential and diverse. Three most commonly used utilities in economics (i.e., Perfect Substitutes, Perfect Complements, and Cobb-Douglas) are proposed to jointly model influence spread and diversity as two factors. We formulate diversified IM as an optimization problem of these utilities, for which we present two approximation algorithms based on non-monotonic submodular maximization and traditional IM respectively. Experimental results show that our diversified IM framework outperforms other natural heuristics, such as embedding and diversified ranking, both in utility maximization and result diversification.
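The three utilities named above have standard textbook forms, which the sketch below applies to an influence score and a diversity score. Several details are assumptions rather than the paper's formulation: diversity is taken as one minus the average pairwise similarity, influence is assumed to be normalized to [0, 1], and the parameter names `a`, `b`, and `alpha` and the community-based similarity are hypothetical.

```python
import math
from itertools import combinations


def diversity(selected, similarity):
    """Assumed form: 1 minus the average pairwise similarity of selected nodes."""
    pairs = list(combinations(selected, 2))
    if not pairs:
        return 1.0
    average = sum(similarity(u, v) for u, v in pairs) / len(pairs)
    return 1.0 - average


def perfect_substitutes(influence, div, a=0.5, b=0.5):
    """Linear trade-off: one factor can fully compensate for the other."""
    return a * influence + b * div


def perfect_complements(influence, div, a=1.0, b=1.0):
    """The weaker factor caps the utility."""
    return min(a * influence, b * div)


def cobb_douglas(influence, div, alpha=0.5):
    """Multiplicative trade-off with exponent alpha on influence."""
    return (influence ** alpha) * (div ** (1.0 - alpha))


# Hypothetical community assignment; similarity is 1 within a community, else 0.
community = {"a": 0, "b": 0, "c": 1}
same_community = lambda u, v: 1.0 if community[u] == community[v] else 0.0

d = diversity(["a", "b", "c"], same_community)
spread = 0.5  # assumed normalized influence spread of the selected set
print(perfect_substitutes(spread, d), perfect_complements(spread, d), cobb_douglas(spread, d))
```

The choice of utility changes what an optimizer favors: perfect substitutes tolerates a set that is influential but homogeneous, while perfect complements penalizes any set that is weak on either factor.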

Automatic Extraction of Commonsense LocatedNear Knowledge

May 13, 2018
Frank F. Xu, Bill Yuchen Lin, Kenny Q. Zhu

The LocatedNear relation is a kind of commonsense knowledge describing two physical objects that are typically found near each other in real life. In this paper, we study how to automatically extract such relationships by applying a sentence-level relation classifier and aggregating the scores of entity pairs over a large corpus. We also release two benchmark datasets for evaluation and future research.
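The aggregation step can be sketched as follows. The abstract does not say how sentence-level scores are combined, so summing them per unordered object pair is an assumption; the function name `aggregate_pair_scores` and the example scores are hypothetical.

```python
from collections import defaultdict


def aggregate_pair_scores(sentence_scores):
    """Sum sentence-level classifier scores per unordered object pair.

    sentence_scores: iterable of ((object_a, object_b), score) tuples, one per
    sentence in which both objects were mentioned.
    Returns (pair, total_score) tuples sorted from highest to lowest total.
    """
    totals = defaultdict(float)
    for (obj_a, obj_b), score in sentence_scores:
        key = tuple(sorted((obj_a, obj_b)))  # treat (a, b) and (b, a) as one pair
        totals[key] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical classifier outputs for three sentences.
ranked = aggregate_pair_scores([
    (("cup", "table"), 0.9),
    (("table", "cup"), 0.8),
    (("cup", "moon"), 0.1),
])
print(ranked)
```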

* Accepted by ACL 2018. A preliminary version was presented at AKBC@NIPS'17

Empower Sequence Labeling with Task-Aware Neural Language Model

Nov 23, 2017
Liyuan Liu, Jingbo Shang, Frank F. Xu, Xiang Ren, Huan Gui, Jian Peng, Jiawei Han

Linguistic sequence labeling is a general modeling approach that encompasses a variety of problems, such as part-of-speech tagging and named entity recognition. Recent advances in neural networks (NNs) make it possible to build reliable models without handcrafted features. However, in many cases, it is hard to obtain sufficient annotations to train these models. In this study, we develop a novel neural framework to extract abundant knowledge hidden in raw texts to empower the sequence labeling task. Besides word-level knowledge contained in pre-trained word embeddings, character-aware neural language models are incorporated to extract character-level knowledge. Transfer learning techniques are further adopted to mediate different components and guide the language model towards the key knowledge. Compared to previous methods, this task-specific knowledge allows us to adopt a more concise model and conduct more efficient training. Different from most transfer learning methods, the proposed framework does not rely on any additional supervision. It extracts knowledge from self-contained order information of training sequences. Extensive experiments on benchmark datasets demonstrate the effectiveness of leveraging character-level knowledge and the efficiency of co-training. For example, on the CoNLL03 NER task, model training completes in about 6 hours on a single GPU, reaching an F1 score of 91.71±0.10 without using any extra annotation.

* AAAI 2018

Indirect Supervision for Relation Extraction using Question-Answer Pairs

Nov 23, 2017
Zeqiu Wu, Xiang Ren, Frank F. Xu, Ji Li, Jiawei Han

Automatic relation extraction (RE) for types of interest is of great importance for interpreting massive text corpora in an efficient manner. Traditional RE models have relied heavily on human-annotated corpora for training, which are costly to produce and become an obstacle when dealing with more relation types. Thus, more RE systems have shifted to training data automatically acquired by linking to knowledge bases (distant supervision). However, due to the incompleteness of knowledge bases and the context-agnostic labeling, the training data collected via distant supervision (DS) can be very noisy. In recent years, as increasing attention has been brought to tackling question-answering (QA) tasks, user feedback and datasets for such tasks have become more accessible. In this paper, we propose a novel framework, ReQuest, to leverage question-answer pairs as an indirect source of supervision for relation extraction, and study how to use such supervision to reduce noise induced from DS. Our model jointly embeds relation mentions, types, QA entity mention pairs and text features in two low-dimensional spaces (RE and QA), where objects with the same relation types or semantically similar question-answer pairs have similar representations. Shared features connect these two spaces, carrying clearer semantic knowledge from both sources. ReQuest then uses these learned embeddings to estimate the types of test relation mentions. We formulate a global objective function and adopt a novel margin-based QA loss to reduce noise in DS by exploiting semantic evidence from the QA dataset. Our experimental results achieve an average of 11% improvement in F1 score on two public RE datasets combined with the TREC QA dataset.

* 9 pages + 1 page reference. Accepted to WSDM 2018
