Qun Liu

DeepSat V2: Feature Augmented Convolutional Neural Nets for Satellite Image Classification

Nov 15, 2019
Qun Liu, Saikat Basu, Sangram Ganguly, Supratik Mukhopadhyay, Robert DiBiano, Manohar Karki, Ramakrishna Nemani

Satellite image classification is a challenging problem that lies at the crossroads of remote sensing, computer vision, and machine learning. Due to the high variability inherent in satellite data, most of the current object classification approaches are not suitable for handling satellite datasets. The progress of satellite image analytics has also been inhibited by the lack of a single labeled high-resolution dataset with multiple class labels. In a preliminary version of this work, we introduced two new high-resolution satellite imagery datasets (SAT-4 and SAT-6) and proposed the DeepSat framework for classification based on "handcrafted" features and a deep belief network (DBN). The present paper is an extended version in which we present an end-to-end framework leveraging an improved architecture that augments a convolutional neural network (CNN) with handcrafted features (instead of using a DBN-based architecture) for classification. Our framework, having access to fused spatial information obtained from handcrafted features as well as CNN feature maps, has achieved accuracies of 99.90% and 99.84% on SAT-4 and SAT-6, respectively, surpassing all the other state-of-the-art results. A statistical analysis based on the Distribution Separability Criterion substantiates the robustness of our approach in learning better representations for satellite imagery.

* This is an Accepted Manuscript of an article published by Taylor & Francis Group in Remote Sensing Letters. arXiv admin note: text overlap with arXiv:1509.03602

Textual Adversarial Attack as Combinatorial Optimization

Nov 10, 2019
Yuan Zang, Chenghao Yang, Fanchao Qi, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun

Adversarial attacks are carried out to reveal the vulnerability of deep neural networks. Textual adversarial attack is challenging because text is discrete and any perturbation might bring a big semantic change. Word substitution is a class of effective textual attack methods and has been extensively explored. However, all existing word substitution-based attack methods suffer from poor semantic preservation, insufficient adversarial examples, or suboptimal attack results. In this paper, we formalize the word substitution-based attack as a combinatorial optimization problem. We also propose a novel attack model, which comprises a sememe-based word substitution strategy and the particle swarm optimization algorithm, to tackle the existing problems. In experiments, we evaluate our attack model on the sentiment analysis task. Experimental results demonstrate that our model achieves higher attack success rates and less modification than the baseline methods. The ablation study also verifies the superiority of the two parts of our model over previous ones.

* Work in progress. 6 pages, 1 figure
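
To make the combinatorial framing concrete, the sketch below treats each word position as a choice among the original word and its candidate substitutes, so the search space is the product of the per-position option counts. It is only an illustration under assumptions: the substitute table, the toy `score` function (standing in for a victim model's confidence in the original label), and the exhaustive search are all hypothetical, and the search is not the particle swarm optimization the abstract describes.

```python
from itertools import product
from math import prod
from typing import Callable, Dict, List, Sequence


def search_space_size(sentence: Sequence[str],
                      substitutes: Dict[str, List[str]]) -> int:
    """Number of candidate sentences: each position keeps the original
    word or takes one of its substitutes."""
    return prod(1 + len(substitutes.get(word, [])) for word in sentence)


def exhaustive_attack(sentence: Sequence[str],
                      substitutes: Dict[str, List[str]],
                      score: Callable[[Sequence[str]], float]) -> List[str]:
    """Return the candidate with the lowest victim score, breaking ties
    by the number of modified words. Only feasible for tiny inputs,
    which is why heuristic search is used in practice."""
    options = [[word] + list(substitutes.get(word, [])) for word in sentence]
    best_key, best_candidate = None, None
    for candidate in product(*options):
        changes = sum(new != old for new, old in zip(candidate, sentence))
        key = (score(candidate), changes)
        if best_key is None or key < best_key:
            best_key, best_candidate = key, list(candidate)
    return best_candidate


# Hypothetical toy setup: the "victim" counts positive words.
POSITIVE = {"good", "great"}
sentence = ["the", "movie", "is", "good"]
substitutes = {"movie": ["film"], "good": ["great", "fine"]}
toy_score = lambda words: sum(w in POSITIVE for w in words)

print(search_space_size(sentence, substitutes))
print(exhaustive_attack(sentence, substitutes, toy_score))
```

Because the candidate count grows multiplicatively with sentence length, exhaustive enumeration quickly becomes impractical, which is the motivation for a heuristic optimizer in this setting.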

Zero-Shot Paraphrase Generation with Multilingual Language Models

Nov 09, 2019
Yinpeng Guo, Yi Liao, Xin Jiang, Qing Zhang, Yibo Zhang, Qun Liu

Leveraging multilingual parallel texts to automatically generate paraphrases has drawn much attention, as the size of high-quality paraphrase corpora is limited. Round-trip translation, also known as the pivoting method, is a typical approach to this end. However, we notice that the pivoting process involves multiple machine translation models and is likely to incur semantic drift during the two-step translations. In this paper, inspired by Transformer-based language models, we propose a simple and unified paraphrasing model, which is trained purely on multilingual parallel data and can conduct zero-shot paraphrase generation in one step. Compared with the pivoting approach, paraphrases generated by our model are more semantically similar to the input sentence. Moreover, since our model shares the same architecture as GPT (Radford et al., 2018), we are able to pre-train the model on a large-scale non-parallel corpus, which further improves the fluency of the output sentences. In addition, we introduce the mechanism of the denoising auto-encoder (DAE) to improve the diversity and robustness of the model. Experimental results show that our model surpasses the pivoting method in terms of relevance, diversity, fluency and efficiency.

How to Do Simultaneous Translation Better with Consecutive Neural Machine Translation?

Nov 08, 2019
Yun Chen, Liangyou Li, Xin Jiang, Xiao Chen, Qun Liu

Despite the success of neural machine translation (NMT), simultaneous neural machine translation (SNMT), the task of translating in real time before a full sentence has been observed, remains challenging due to the syntactic structure difference and simultaneity requirements. In this paper, we propose a general framework to improve simultaneous translation with a pretrained consecutive neural machine translation (CNMT) model. Our framework contains two parts: prefix translation that utilizes a pretrained CNMT model to better translate source prefixes and a stopping criterion that determines when to stop the prefix translation. Experiments on three translation corpora and two language pairs show the efficacy of the proposed framework on balancing the quality and latency in simultaneous translation.
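
The two-part structure described above (prefix translation plus a stopping criterion) can be sketched as a simple read/write loop. This is a minimal illustration, not the paper's method: `next_token` is a hypothetical stand-in for a pretrained consecutive NMT model, and the lag-based stopping rule controlled by `max_lag` is an assumed placeholder for whatever criterion the authors actually use.

```python
from typing import Callable, List, Optional, Sequence


def simultaneous_translate(
    source: Sequence[str],
    next_token: Callable[[Sequence[str], Sequence[str]], Optional[str]],
    max_lag: int,
) -> List[str]:
    """Alternate between reading one source token and writing target
    tokens for the current source prefix."""
    target: List[str] = []
    for read in range(1, len(source) + 1):
        prefix = source[:read]
        source_complete = read == len(source)
        while True:
            # Assumed stopping criterion: while the source is still
            # arriving, keep the target max_lag tokens behind it.
            if not source_complete and len(target) >= read - max_lag:
                break
            token = next_token(prefix, target)
            if token is None:  # the model signals end of output
                break
            target.append(token)
    return target


def toy_next_token(prefix: Sequence[str],
                   target: Sequence[str]) -> Optional[str]:
    """Hypothetical 'model': translates word by word via uppercasing."""
    if len(target) < len(prefix):
        return prefix[len(target)].upper()
    return None


print(simultaneous_translate(["a", "b", "c", "d"], toy_next_token, max_lag=1))
```

A smaller `max_lag` lowers latency but forces the model to commit to target tokens with less source context, which is the quality and latency trade-off the abstract refers to.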

Pretrained Language Models for Document-Level Neural Machine Translation

Nov 08, 2019
Liangyou Li, Xin Jiang, Qun Liu

Previous work on document-level NMT usually focuses on limited contexts because of degraded performance on larger contexts. In this paper, we investigate using large contexts, with three main contributions: (1) Different from previous work, which pretrained models on large-scale sentence-level parallel corpora, we use pretrained language models, specifically BERT, which are trained on monolingual documents; (2) We propose context manipulation methods to control the influence of large contexts, which lead to comparable results on systems using small and large contexts; (3) We introduce multi-task training for regularization to avoid models overfitting our training corpora, which further improves our systems together with a deeper encoder. Experiments are conducted on the widely used IWSLT data sets with three language pairs, i.e., Chinese--English, French--English and Spanish--English. Results show that our systems are significantly better than three previously reported document-level systems.

Open the Boxes of Words: Incorporating Sememes into Textual Adversarial Attack

Oct 27, 2019
Yuan Zang, Chenghao Yang, Fanchao Qi, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun

Adversarial attacks are carried out to reveal the vulnerability of deep neural networks. Word substitution is a class of effective adversarial textual attack methods, which has been extensively explored. However, all existing studies utilize word embeddings or thesauruses to find substitutes. In this paper, we incorporate sememes, the minimum semantic units, into adversarial attack. We propose an efficient sememe-based word substitution strategy and integrate it into a genetic attack algorithm. In experiments, we employ our attack method to attack LSTM and BERT on both Chinese and English sentiment analysis as well as natural language inference benchmark datasets. Experimental results demonstrate that our model achieves better attack success rates and less modification than the baseline methods based on word embeddings or synonyms. Furthermore, we find that our attack model can bring more robustness enhancement to the target model with adversarial training.

* Work in progress. 5 pages, 1 figure

Enhancing Recurrent Neural Networks with Sememes

Oct 20, 2019
Yujia Qin, Fanchao Qi, Sicong Ouyang, Zhiyuan Liu, Cheng Yang, Yasheng Wang, Qun Liu, Maosong Sun

Sememes, the minimum semantic units of human languages, have been successfully utilized in various natural language processing applications. However, most existing studies exploit sememes in specific tasks, and few efforts are made to utilize sememes more fundamentally. In this paper, we propose to incorporate sememes into recurrent neural networks (RNNs) to improve their sequence modeling ability, which is beneficial to all kinds of downstream tasks. We design three different sememe incorporation methods and employ them in typical RNNs including LSTM, GRU and their bidirectional variants. For evaluation, we use several benchmark datasets, including PTB and WikiText-2 for language modeling and SNLI for natural language inference. Experimental results show evident and consistent improvement of our sememe-incorporated models compared with vanilla RNNs, which demonstrates the effectiveness of our sememe incorporation methods. Moreover, we find that the sememe-incorporated models are highly robust and outperform adversarial training in defending against adversarial attacks. All the code and data of this work will be made available to the public.

* 8 pages, 2 figures

TinyBERT: Distilling BERT for Natural Language Understanding

Sep 24, 2019
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu

Language model pre-training, such as BERT, has significantly improved the performance of many natural language processing tasks. However, pre-trained language models are usually computationally expensive and memory intensive, so it is difficult to execute them effectively on some resource-restricted devices. To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel transformer distillation method, a knowledge distillation (KD) method specially designed for transformer-based models. By leveraging this new KD method, the rich knowledge encoded in a large teacher BERT can be well transferred to a small student TinyBERT. Moreover, we introduce a new two-stage learning framework for TinyBERT, which performs transformer distillation at both the pre-training and task-specific learning stages. This framework ensures that TinyBERT can capture both the general-domain and task-specific knowledge of the teacher BERT. TinyBERT is empirically effective and achieves results comparable with BERT on the GLUE datasets, while being 7.5x smaller and 9.4x faster at inference. TinyBERT is also significantly better than state-of-the-art baselines, even with only about 28% of the parameters and 31% of the inference time of the baselines.

* 13 pages, 2 figures, 9 tables
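
As background for the knowledge distillation idea mentioned above, here is a minimal sketch of the generic soft-target loss, in which a student is trained to match a teacher's temperature-softened output distribution. This is the standard KD objective, not the transformer distillation method the abstract proposes; the function names and the example logits are illustrative assumptions.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(teacher_logits: Sequence[float],
                       student_logits: Sequence[float],
                       temperature: float = 1.0) -> float:
    """Cross-entropy between the teacher's and the student's softened
    distributions; it is minimized when the two distributions agree."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))


teacher = [2.0, 0.5, -1.0]          # illustrative teacher logits
close_student = [1.8, 0.6, -0.9]    # roughly agrees with the teacher
far_student = [-1.0, 0.5, 2.0]      # reverses the teacher's ranking

print(soft_cross_entropy(teacher, close_student, temperature=2.0))
print(soft_cross_entropy(teacher, far_student, temperature=2.0))
```

A higher temperature flattens both distributions, so the student also receives signal from the teacher's relative preferences among lower-probability classes.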

NEZHA: Neural Contextualized Representation for Chinese Language Understanding

Sep 05, 2019
Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen, Qun Liu

Pre-trained language models have achieved great success in various natural language understanding (NLU) tasks due to their capacity to capture deep contextualized information in text by pre-training on large-scale corpora. In this technical report, we present our practice of pre-training language models named NEZHA (NEural contextualiZed representation for CHinese lAnguage understanding) on Chinese corpora and finetuning them for Chinese NLU tasks. The current version of NEZHA is based on BERT with a collection of proven improvements, which include Functional Relative Positional Encoding as an effective positional encoding scheme, the Whole Word Masking strategy, Mixed Precision Training and the LAMB Optimizer for training the models. The experimental results show that NEZHA achieves state-of-the-art performance when finetuned on several representative Chinese tasks, including named entity recognition (People's Daily NER), sentence matching (LCQMC), Chinese sentiment classification (ChnSenti) and natural language inference (XNLI).
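
Of the improvements listed above, Whole Word Masking is the easiest to illustrate: when a word is chosen for masking, all of its sub-tokens are masked together rather than independently. The sketch below assumes WordPiece-style tokens where a `##` prefix marks a continuation piece; for Chinese text the word boundaries would instead come from a word segmenter, and none of the names here are taken from the NEZHA code.

```python
import random
from typing import List, Sequence


def whole_word_mask(tokens: Sequence[str],
                    mask_prob: float,
                    rng: random.Random,
                    mask_token: str = "[MASK]") -> List[str]:
    """Mask whole words: every sub-token of a selected word is replaced."""
    # Group token indices into words; "##" pieces join the previous word.
    words: List[List[int]] = []
    for index, token in enumerate(tokens):
        if token.startswith("##") and words:
            words[-1].append(index)
        else:
            words.append([index])

    masked = list(tokens)
    for word in words:
        # One random draw per word, not per sub-token.
        if rng.random() < mask_prob:
            for index in word:
                masked[index] = mask_token
    return masked


tokens = ["play", "##ing", "foot", "##ball", "is", "fun"]
print(whole_word_mask(tokens, mask_prob=0.5, rng=random.Random(0)))
```

Because the draw happens once per word, a word such as `play ##ing` is either fully visible or fully masked, so the model cannot recover a masked piece from its own neighbouring pieces.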
