Hai Zhao

Semantic Role Labeling with Associated Memory Network

Aug 05, 2019
Chaoyu Guan, Yuhao Cheng, Hai Zhao

Semantic role labeling (SRL) is the task of recognizing all the predicate-argument pairs of a sentence; its performance has plateaued despite a series of recent works. This paper proposes a novel syntax-agnostic SRL model enhanced by an associated memory network (AMN), which uses inter-sentence attention over label-known associated sentences as a kind of memory to further enhance dependency-based SRL. In detail, we use sentences and their labels from the training set as an associated memory cue to help label the target sentence. Furthermore, we compare several strategies for selecting associated sentences and several label-merging methods in the AMN, so as to find and exploit the labels of associated sentences while attending to them. By leveraging this attentive memory drawn from known training data, our full model reaches state-of-the-art performance on the CoNLL-2009 benchmark datasets under the syntax-agnostic setting, pointing to an effective new line of SRL enhancement beyond exploiting external resources such as well pre-trained language models.

* Published at NAACL 2019; this is the camera-ready version. Code is available at https://github.com/Frozenmad/AMN_SRL 
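
To make the memory mechanism concrete, here is a minimal sketch of the idea as the abstract describes it: target-sentence tokens attend over the tokens of a label-known training sentence, and the attention weights pull back that sentence's gold label embeddings as extra features for the target tokens. All names, dimensions, and the dot-product scorer are illustrative assumptions, not the authors' released AMN code.

    import torch
    import torch.nn.functional as F

    d_model, d_label, n_labels = 128, 32, 54   # assumed sizes
    label_emb = torch.nn.Embedding(n_labels, d_label)

    def memory_features(target_h, assoc_h, assoc_labels):
        # target_h: (n, d) target-sentence token encodings
        # assoc_h: (m, d) encodings of a label-known training sentence
        # assoc_labels: (m,) gold SRL label ids for that sentence
        scores = target_h @ assoc_h.t() / d_model ** 0.5   # inter-sentence attention
        attn = F.softmax(scores, dim=-1)                   # (n, m)
        merged = attn @ label_emb(assoc_labels)            # pull back label embeddings
        return torch.cat([target_h, merged], dim=-1)       # (n, d + d_label)

    h_t = torch.randn(9, d_model)                # toy target sentence, 9 tokens
    h_a = torch.randn(12, d_model)               # toy associated sentence, 12 tokens
    y_a = torch.randint(0, n_labels, (12,))
    enriched = memory_features(h_t, h_a, y_a)    # feed into the downstream SRL scorer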

Head-Driven Phrase Structure Grammar Parsing on Penn Treebank

Jul 08, 2019
Junru Zhou, Hai Zhao

Head-driven phrase structure grammar (HPSG) enjoys a uniform formalism that represents rich contextual syntactic and even semantic meaning. This paper makes the first attempt to formulate a simplified HPSG by integrating constituent and dependency formal representations into a head-driven phrase structure. Two parsing algorithms are then proposed for the two converted tree representations, division span and joint span. As HPSG encodes both constituent and dependency structure information, the proposed HPSG parsers may be regarded as a kind of joint decoder for both structure types, and are thus evaluated on the constituent and dependency parse trees extracted or converted from their output. Our parser achieves new state-of-the-art performance on both parsing tasks on the Penn Treebank (PTB) and the Chinese Penn Treebank, verifying the effectiveness of jointly learning constituent and dependency structures. In detail, we report 95.84 F1 for constituent parsing and 97.00% UAS for dependency parsing on PTB.

* Accepted by ACL 2019 
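
As a toy illustration of what a joint span buys (our own reading of the abstract, not the paper's algorithm), the structure below annotates each constituent span with its lexical head, so a single tree yields both the constituent parse (the labeled spans) and a dependency parse (head-to-head arcs):

    from dataclasses import dataclass, field

    @dataclass
    class JointSpan:
        left: int                 # token span [left, right)
        right: int
        label: str                # constituent label
        head: int                 # index of the span's lexical head token
        children: list = field(default_factory=list)

    def dependency_arcs(node, arcs=None):
        # every child whose head differs from the parent's head attaches
        # to the parent's head, yielding (dependent, head) arcs
        if arcs is None:
            arcs = []
        for child in node.children:
            if child.head != node.head:
                arcs.append((child.head, node.head))
            dependency_arcs(child, arcs)
        return arcs

    # "She reads books": S spans tokens 0..3 and is headed by "reads" (token 1)
    tree = JointSpan(0, 3, "S", head=1, children=[
        JointSpan(0, 1, "NP", head=0),
        JointSpan(1, 3, "VP", head=1, children=[JointSpan(2, 3, "NP", head=2)]),
    ])
    print(dependency_arcs(tree))   # [(0, 1), (2, 1)]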

Lattice-Based Transformer Encoder for Neural Machine Translation

Jun 04, 2019
Fengshun Xiao, Jiangtong Li, Hai Zhao, Rui Wang, Kehai Chen

Neural machine translation (NMT) takes deterministic sequences as source representations. However, both word-level and subword-level segmentation admit multiple ways to split a source sequence, depending on the word segmenter or the subword vocabulary size. We hypothesize that this diversity in segmentations may affect NMT performance. To integrate different segmentations with the state-of-the-art NMT model, the Transformer, we propose lattice-based encoders that explore effective word or subword representations automatically during training. We propose two methods: 1) lattice positional encoding and 2) lattice-aware self-attention. The two methods can be used together and prove complementary to each other, further improving translation performance. Experimental results show the superiority of lattice-based encoders over the conventional Transformer encoder at both the word and subword levels.

* Accepted by ACL 2019 
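
A minimal sketch of the two proposed components as we read the abstract, in illustrative PyTorch: lattice positional encoding indexes each lattice node by the position of its first character, so alternative segmentations share one position scale, and lattice-aware self-attention blocks attention between nodes from mutually exclusive segmentations. The masking rule here is our own simplification, not the paper's released implementation.

    import torch
    import torch.nn.functional as F

    # toy lattice over a 4-character source: each node is a (start, end) span
    # from some segmentation; (0, 2) and (1, 3) are mutually exclusive
    spans = [(0, 1), (1, 2), (0, 2), (1, 3)]

    # 1) lattice positional encoding: index each node by its first character
    pos_ids = torch.tensor([s for s, _ in spans])

    # 2) lattice-aware self-attention: block attention between overlapping,
    #    non-nested nodes (competing segmentation choices)
    def lattice_mask(spans):
        n = len(spans)
        mask = torch.zeros(n, n, dtype=torch.bool)
        for i, (si, ei) in enumerate(spans):
            for j, (sj, ej) in enumerate(spans):
                overlap = si < ej and sj < ei
                nested = (si <= sj and ej <= ei) or (sj <= si and ei <= ej)
                mask[i, j] = overlap and not nested
        return mask                       # True = attention blocked

    h = torch.randn(len(spans), 64)       # toy node encodings
    scores = h @ h.t() / 64 ** 0.5
    scores = scores.masked_fill(lattice_mask(spans), float("-inf"))
    attn = F.softmax(scores, dim=-1)      # (0, 2) and (1, 3) never attend to each other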

Judging Chemical Reaction Practicality From Positive Sample only Learning

Apr 22, 2019
Shu Jiang, Zhuosheng Zhang, Hai Zhao, Jiangtong Li, Yang Yang, Bao-Liang Lu, Ning Xia

Judging chemical reaction practicality is the core task in symbol-intelligence-based chemical information processing; for example, it provides an indispensable clue for automatic synthesis route inference. Observing that chemical reactions can be represented in a language-like form, we propose a new solution that judges the practicality of organic reactions in general, without complex quantum-physical modeling or chemistry knowledge. When practicality judgment is tackled as a machine learning task over positive and negative (chemical reaction) samples, all existing studies must carefully handle the serious scarcity of negative samples. We propose an auto-construction method that resolves this long-standing difficulty. Experimental results show that our model can effectively predict the practicality of chemical reactions, achieving a high accuracy of 99.76% on real, large-scale chemical-lab reaction practicality judgment.
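
The key data trick is easiest to see in code. Below is a sketch under one loud assumption: negatives are fabricated by pairing the reactants of one reaction with the product of another. This is merely a stand-in for the paper's actual auto-construction method; the toy reaction strings are ours.

    import random

    # toy positive reactions as (reactants, product) strings
    positives = [
        ("CCO + [O]", "CC=O"),       # ethanol -> acetaldehyde
        ("CC=O + [O]", "CC(=O)O"),   # acetaldehyde -> acetic acid
        ("C=C + HBr", "CCBr"),       # ethylene + HBr -> bromoethane
    ]

    def auto_negatives(pos, k=1, rng=random.Random(0)):
        # fabricate negatives by pairing each reactant set with the product
        # of a different reaction -- a stand-in for the paper's method
        neg = []
        for i, (reactants, _) in enumerate(pos):
            for _ in range(k):
                j = rng.choice([x for x in range(len(pos)) if x != i])
                neg.append((reactants, pos[j][1]))
        return neg

    data = [(r, p, 1) for r, p in positives] + \
           [(r, p, 0) for r, p in auto_negatives(positives)]
    # `data` now trains any text-pair classifier over reaction strings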


Span Based Open Information Extraction

Mar 01, 2019
Junlang Zhan, Hai Zhao

In this paper, we propose a span-based model combined with syntactic information for n-ary open information extraction. The advantage of a span model is that it can leverage span-level features, which is difficult in token-based BIO tagging methods. We also improve on a previous bootstrapping method for constructing the training corpus. Experiments show that our model outperforms previous open information extraction systems. Our code and data are publicly available at https://github.com/zhanjunlang/Span_OIE

* There is an error in this article. In Section 2.2, we state that span-level syntactic information is helpful for Open IE, which is one of the major contributions of this paper. However, upon examination we found a fatal error in the code for this part, so the statement does not hold. 
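
For readers unfamiliar with span models, here is a minimal sketch of span enumeration and endpoint-based scoring. The scorer, role set, and feature choice are our own illustrations (and, per the erratum above, span-level syntactic features are deliberately left out):

    import torch

    def enumerate_spans(n, max_len=4):
        # all spans [i, j) of up to max_len tokens
        return [(i, j) for i in range(n)
                       for j in range(i + 1, min(i + max_len, n) + 1)]

    class SpanScorer(torch.nn.Module):
        def __init__(self, d=64, n_roles=4):   # e.g. none/predicate/arg0/arg1
            super().__init__()
            self.ffn = torch.nn.Linear(2 * d, n_roles)

        def forward(self, h, spans):
            # span feature = [first token; last token] encodings, one common
            # span-level feature choice
            feats = torch.stack([torch.cat([h[i], h[j - 1]]) for i, j in spans])
            return self.ffn(feats)             # (num_spans, n_roles)

    h = torch.randn(6, 64)                     # toy 6-token sentence encodings
    scores = SpanScorer()(h, enumerate_spans(6))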

Dual Co-Matching Network for Multi-choice Reading Comprehension

Jan 27, 2019
Shuailiang Zhang, Hai Zhao, Yuwei Wu, Zhuosheng Zhang, Xi Zhou, Xiang Zhou

Multi-choice reading comprehension is a challenging task that requires a complex reasoning procedure. Given a passage and a question, the correct answer must be selected from a set of candidate answers. In this paper, we propose the Dual Co-Matching Network (DCMN), which models the relationships among passage, question, and answer bidirectionally. Unlike existing approaches that compute only a question-aware or option-aware passage representation, we also compute a passage-aware question representation and a passage-aware answer representation. To demonstrate the effectiveness of our model, we evaluate it on a large-scale multiple-choice machine reading comprehension dataset (i.e., RACE). Experimental results show that our proposed model achieves new state-of-the-art results.

* arXiv admin note: text overlap with arXiv:1806.04068 by other authors 
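
A sketch of the bidirectional matching idea: one similarity matrix yields both a question-aware passage representation and a passage-aware question representation, which are pooled into a single matching vector. The dimensions and max-pooling are illustrative guesses, not the paper's exact architecture.

    import torch
    import torch.nn.functional as F

    def dual_match(P, Q):
        # P: (lp, d) passage token encodings, Q: (lq, d) question encodings
        S = P @ Q.t()                            # (lp, lq) similarity matrix
        q_aware_P = F.softmax(S, dim=1) @ Q      # passage tokens summarize Q
        p_aware_Q = F.softmax(S.t(), dim=1) @ P  # question tokens summarize P
        # max-pool each direction and concatenate into one matching vector
        return torch.cat([q_aware_P.max(dim=0).values,
                          p_aware_Q.max(dim=0).values])

    P, Q = torch.randn(40, 128), torch.randn(12, 128)
    match_vec = dual_match(P, Q)   # (256,); computed per candidate answer
                                   # and scored to pick the best option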

Chemical Names Standardization using Neural Sequence to Sequence Model

Jan 21, 2019
Junlang Zhan, Hai Zhao

Chemical information extraction converts chemical knowledge in text into a structured chemical database, a text processing task that relies heavily on identifying and standardizing chemical compound names. Once the systematic name of a chemical compound is given, it can be converted straightforwardly into the required molecular formula. However, many chemical substances appear under numerous names other than their systematic ones, which poses a great challenge for this task. In this paper, we propose a framework for automatically standardizing non-systematic names into the corresponding systematic names using spelling error correction, byte pair encoding tokenization, and a neural sequence-to-sequence model. Our framework is trained end to end and is fully data-driven. Our standardization accuracy on the test dataset reaches 54.04%, a great improvement over the previous state-of-the-art result.
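
The three-stage pipeline in the abstract can be sketched as follows. Every helper here is a toy stand-in of our own (nearest-lexicon spelling correction, a one-merge BPE), not the authors' code.

    import difflib

    lexicon = ["benzene", "ethanol", "acetone"]   # toy chemical-token lexicon

    def correct_spelling(word):
        # nearest lexicon entry by string similarity, standing in for the
        # paper's spelling-error correction stage
        hits = difflib.get_close_matches(word, lexicon, n=1, cutoff=0.8)
        return hits[0] if hits else word

    def bpe_tokenize(name, merges={("e", "n"): "en"}):
        # a one-merge toy BPE; real systems learn thousands of merges
        toks, out, i = list(name), [], 0
        while i < len(toks):
            if i + 1 < len(toks) and (toks[i], toks[i + 1]) in merges:
                out.append(merges[(toks[i], toks[i + 1])])
                i += 2
            else:
                out.append(toks[i])
                i += 1
        return out

    name = " ".join(correct_spelling(w) for w in "benzen".split())
    src_tokens = bpe_tokenize(name)    # ['b', 'en', 'z', 'en', 'e']
    # src_tokens -> any encoder-decoder (LSTM or Transformer seq2seq) trained
    # end to end on (non-systematic name, systematic name) pairs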


Chinese Word Segmentation: Another Decade Review (2007-2017)

Jan 18, 2019
Hai Zhao, Deng Cai, Changning Huang, Chunyu Kit

This paper reviews the development of Chinese word segmentation (CWS) over the most recent decade, 2007-2017. Special attention is paid to the deep learning technologies that have already permeated most areas of natural language processing (NLP). The basic view we have arrived at is that, compared to traditional supervised learning methods, neural network based methods have not shown superior performance. The most critical challenge still lies in balancing the recognition of in-vocabulary (IV) and out-of-vocabulary (OOV) words. However, as neural models have the potential to capture the essential linguistic structure of natural language, we are optimistic that significant progress may arrive in the near future.

* in Chinese 

Dependency or Span, End-to-End Uniform Semantic Role Labeling

Jan 16, 2019
Zuchao Li, Shexia He, Hai Zhao, Yiqing Zhang, Zhuosheng Zhang, Xi Zhou, Xiang Zhou

Semantic role labeling (SRL) aims to discover the predicate-argument structure of a sentence. End-to-end SRL without syntactic input has received great attention. However, most work focuses on either the span-based or the dependency-based semantic representation form, with model optimizations specific to each; handling the two SRL tasks uniformly has been less successful. This paper presents an end-to-end model for both dependency and span SRL with a unified argument representation that handles the two different types of argument annotation in a uniform fashion. Furthermore, we jointly predict all predicates and arguments, including the long-ignored predicate identification subtask. Our single model achieves new state-of-the-art results on both span (CoNLL 2005, 2012) and dependency (CoNLL 2008, 2009) SRL benchmarks.
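
One plausible reading of the unified argument representation, sketched below: a dependency-style argument (a single head word) is treated as a width-1 span, so both annotation types pass through one argument scorer. The shapes, the bilinear role scorer, and all names are our assumptions, not the paper's exact design.

    import torch

    class UnifiedArgScorer(torch.nn.Module):
        def __init__(self, d=64, n_roles=10):
            super().__init__()
            self.pred = torch.nn.Linear(d, d)
            self.arg = torch.nn.Linear(2 * d, d)
            self.role = torch.nn.Bilinear(d, d, n_roles)

        def forward(self, h, pred_idx, spans):
            # h: (n, d) token encodings; spans: list of (start, end) arguments
            p = self.pred(h[pred_idx])                    # predicate vector (d,)
            a = torch.stack([self.arg(torch.cat([h[i], h[j]])) for i, j in spans])
            return self.role(p.expand_as(a), a)           # (num_spans, n_roles)

    h = torch.randn(7, 64)
    span_args = [(1, 3)]   # span-style argument covering tokens 1..3
    dep_args = [(5, 5)]    # dependency-style argument: head word 5 as a width-1 span
    scores = UnifiedArgScorer()(h, 2, span_args + dep_args)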
