Xingwu Liu


Enhanced Language Representation with Label Knowledge for Span Extraction

Nov 01, 2021
Pan Yang, Xin Cong, Zhenyun Sun, Xingwu Liu

Figures 1–4 for Enhanced Language Representation with Label Knowledge for Span Extraction

Span extraction, which aims to extract text spans (such as words or phrases) from plain text, is a fundamental task in Information Extraction. Recent works introduce label knowledge to enhance the text representation by formalizing span extraction as a question answering problem (QA Formalization), achieving state-of-the-art performance. However, QA Formalization does not fully exploit the label knowledge and suffers from low efficiency in training/inference. To address these problems, we introduce a new paradigm for integrating label knowledge and further propose a novel model that explicitly and efficiently integrates label knowledge into text representations. Specifically, it encodes texts and label annotations independently and then integrates label knowledge into the text representation with an elaborately designed semantics fusion module. We conduct extensive experiments on three typical span extraction tasks: flat NER, nested NER, and event detection. The empirical results show that 1) our method achieves state-of-the-art performance on four benchmarks, and 2) it reduces training time and inference time by 76% and 77% on average, respectively, compared with the QA Formalization paradigm. Our code and data are available at https://github.com/Akeepers/LEAR.
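The core idea of the abstract, encoding texts and labels separately and then fusing label knowledge into the token representations, can be illustrated with a minimal attention-based sketch. This is not the paper's actual semantics fusion module (see the LEAR repository for that); the function name and the additive fusion rule here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_label_knowledge(token_emb, label_emb):
    """Illustrative fusion: each token attends over the label-annotation
    embeddings, and the attended label summary is added back into the
    token representation (a hypothetical stand-in for the paper's
    semantics fusion module)."""
    d = token_emb.shape[-1]
    scores = token_emb @ label_emb.T / np.sqrt(d)  # (n_tokens, n_labels)
    attn = softmax(scores, axis=-1)                # attention over labels
    return token_emb + attn @ label_emb            # (n_tokens, d)

tokens = np.random.randn(5, 8)  # 5 tokens, embedding dim 8
labels = np.random.randn(3, 8)  # 3 label annotations, same dim
fused = fuse_label_knowledge(tokens, labels)
print(fused.shape)  # (5, 8)
```

Because the text and label encoders run independently, the label embeddings can be computed once and reused, which is consistent with the efficiency gains the abstract reports over QA Formalization (where each label must be re-encoded as a question alongside every input).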

* Accepted to the main conference of EMNLP 2021 (long paper) 

McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds

Sep 09, 2019
Rui Ray Zhang, Xingwu Liu, Yuyi Wang, Liwei Wang

Figures 1–3 for McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds

A crucial assumption in most of statistical learning theory is that samples are independently and identically distributed (i.i.d.). However, for many real applications, the i.i.d. assumption does not hold. We consider learning problems in which examples are dependent and their dependency relation is characterized by a graph. To establish algorithm-dependent generalization theory for learning with non-i.i.d. data, we first prove novel McDiarmid-type concentration inequalities for Lipschitz functions of graph-dependent random variables. We show that concentration relies on the forest complexity of the graph, which characterizes the strength of the dependency. We demonstrate that for many types of dependent data, the forest complexity is small, implying good concentration. Based on our new inequalities, we build stability bounds for learning from graph-dependent data.
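For context, the classical (i.i.d.) McDiarmid inequality that the paper generalizes can be stated as follows; the graph-dependent version schematically replaces the bounded-difference sum in the denominator with a forest-complexity-weighted analogue (the exact constants and the definition of forest complexity are given in the paper).

```latex
% Classical McDiarmid: if X_1,\dots,X_n are independent and
% f satisfies the bounded-difference property with constants c_i, then
\Pr\bigl( f(X_1,\dots,X_n) - \mathbb{E}\, f \ge t \bigr)
  \le \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^{n} c_i^2} \right).
% For graph-dependent variables, the paper proves an inequality of the
% same exponential form in which the denominator is inflated by a
% factor depending on the forest complexity \Lambda(G) of the
% dependency graph G, recovering the classical bound when G is empty.
```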

* Accepted as a NeurIPS 2019 spotlight paper 

On the ERM Principle with Networked Data

Nov 22, 2017
Yuanhong Wang, Yuyi Wang, Xingwu Liu, Juhua Pu

Figures 1–2 for On the ERM Principle with Networked Data

Networked data, in which every training example involves two objects and may share some common objects with others, arises in many machine learning tasks such as learning to rank and link prediction. A challenge of learning from networked examples is that target values are not known for some pairs of objects. In this case, neither the classical i.i.d. assumption nor techniques based on complete U-statistics can be used. Most existing theoretical results on this problem only address the classical empirical risk minimization (ERM) principle, which always weights every example equally, but this strategy leads to unsatisfactory bounds. We consider general weighted ERM and show new universal risk bounds for this problem. These new bounds naturally define an optimization problem that yields appropriate weights for networked examples. Though this optimization problem is not convex in general, we devise a new fully polynomial-time approximation scheme (FPTAS) to solve it.
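The weighted ERM objective the abstract refers to can be sketched in a few lines: classical ERM is the special case of uniform weights, while the paper's FPTAS chooses non-uniform weights that account for shared objects between examples. The function name below is hypothetical, and the weight vectors are arbitrary illustrations, not weights produced by the paper's optimization.

```python
import numpy as np

def weighted_empirical_risk(losses, weights):
    """Weighted ERM objective: a convex combination of per-example
    losses. Uniform weights recover the classical ERM average."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize to a probability vector
    return float(w @ np.asarray(losses, dtype=float))

losses = [0.2, 0.8, 0.5, 0.1]
uniform = weighted_empirical_risk(losses, [1, 1, 1, 1])
print(uniform)  # 0.4 — identical to classical ERM's plain average
# Down-weighting examples that share objects (here, examples 2 and 3)
# changes the objective; the paper's FPTAS picks such weights to
# optimize the resulting risk bound.
print(weighted_empirical_risk(losses, [2, 1, 1, 2]))
```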

* Accepted by AAAI. arXiv admin note: substantial text overlap with arXiv:math/0702683 by other authors 