"Topic": models, code, and papers

HyperAdam: A Learnable Task-Adaptive Adam for Network Training

Nov 22, 2018
Shipeng Wang, Jian Sun, Zongben Xu

Deep neural networks are traditionally trained using human-designed stochastic optimization algorithms, such as SGD and Adam. Recently, the approach of learning to optimize network parameters has emerged as a promising research topic. However, these learned black-box optimizers sometimes do not fully utilize the experience embodied in human-designed optimizers and therefore have limited generalization ability. In this paper, a new optimizer, dubbed HyperAdam, is proposed that combines the idea of "learning to optimize" with the traditional Adam optimizer. Given a network to train, the parameter update that HyperAdam generates in each iteration is an adaptive combination of multiple updates generated by Adam with varying decay rates. The combination weights and decay rates in HyperAdam are adaptively learned depending on the task. HyperAdam is modeled as a recurrent neural network with AdamCell, WeightCell and StateCell. It is reported to be state-of-the-art for training various networks, such as multilayer perceptrons, CNNs and LSTMs.
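
The core idea in the abstract, an update formed as a weighted combination of several Adam updates with different decay rates, can be illustrated with a minimal sketch. This is not the authors' implementation: in HyperAdam the combination weights and decay rates are produced by a learned recurrent network, whereas here they are fixed by hand. The function names, the scalar test objective, and all numeric values are assumptions chosen for illustration.

```python
import math

def adam_direction(m, v, grad, beta1, beta2, step, eps=1e-8):
    """Update Adam's moment estimates and return (direction, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** step)  # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** step)  # bias-corrected second moment
    return m_hat / (math.sqrt(v_hat) + eps), m, v

def combined_adam_minimize(grad_fn, x0, decay_rates, weights, lr=0.1, steps=500):
    """Minimize a scalar function with a fixed mix of Adam directions.

    decay_rates: (beta1, beta2) pairs, one per candidate Adam update.
    weights: combination weights. ASSUMPTION: fixed here, whereas
    HyperAdam learns them (and the decay rates) per task.
    """
    total = sum(weights)
    weights = [w / total for w in weights]  # normalize to sum to 1
    moments = [(0.0, 0.0) for _ in decay_rates]
    x = x0
    for step in range(1, steps + 1):
        g = grad_fn(x)
        update = 0.0
        for i, (beta1, beta2) in enumerate(decay_rates):
            m, v = moments[i]
            direction, m, v = adam_direction(m, v, g, beta1, beta2, step)
            moments[i] = (m, v)
            update += weights[i] * direction
        x -= lr * update
    return x

# Hypothetical usage: minimize f(x) = (x - 3)^2, whose gradient is 2 (x - 3).
x_final = combined_adam_minimize(
    lambda x: 2.0 * (x - 3.0),
    x0=0.0,
    decay_rates=[(0.5, 0.9), (0.9, 0.99), (0.9, 0.999)],
    weights=[0.2, 0.3, 0.5],
)
print(x_final)  # should end up close to the minimizer x = 3
```

Each candidate keeps its own moment estimates, so the candidates differ only in how quickly they forget past gradients; the weights decide how much each time scale contributes to the final step.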

Deep Neural Ranking for Crowdsourced Geopolitical Event Forecasting

Oct 23, 2018
Giuseppe Nebbione, Derek Doran, Srikanth Nadella, Brandon Minnery

There are many examples of 'wisdom of the crowd' effects in which the large number of participants imparts confidence in the collective judgment of the crowd. But how do we form an aggregated judgment when the size of the crowd is limited? Whose judgments do we include, and whose do we accord the most weight? This paper considers this problem in the context of geopolitical event forecasting, where volunteer analysts are queried to give their expertise, confidence, and predictions about the outcome of an event. We develop a forecast aggregation model that integrates topical information about a question, meta-data about a pair of forecasters, and their predictions in a deep siamese neural network that decides which forecasters' predictions are more likely to be close to the correct response. A ranking of the forecasters is induced from a tournament of pair-wise forecaster comparisons, with the ranking used to create an aggregate forecast. Preliminary results find the aggregate prediction of the best forecasters ranked by our deep siamese network model consistently beats typical aggregation techniques by Brier score.
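
The aggregation pipeline described above (pairwise comparisons, a tournament ranking, then an aggregate forecast scored by Brier score) can be sketched as follows. The paper's comparator is a deep siamese network; the sketch replaces it with a hypothetical lookup table of skill values, and the forecaster names, predictions, and the top-k averaging rule are all assumptions made for illustration.

```python
from itertools import combinations

def brier_score(prob, outcome):
    """Squared error between a probability forecast and a binary outcome (0 or 1)."""
    return (prob - outcome) ** 2

def rank_by_tournament(forecasters, prefer):
    """Rank forecasters by number of wins in a round-robin of pairwise comparisons.

    prefer(a, b) returns whichever forecaster is judged more likely to be
    close to the correct response.
    """
    wins = {f: 0 for f in forecasters}
    for a, b in combinations(forecasters, 2):
        wins[prefer(a, b)] += 1
    return sorted(forecasters, key=lambda f: wins[f], reverse=True)

def aggregate_top_k(ranking, predictions, k):
    """Average the predictions of the k best-ranked forecasters."""
    top = ranking[:k]
    return sum(predictions[f] for f in top) / len(top)

# Hypothetical data: probability each forecaster assigns to the event occurring.
predictions = {"ann": 0.9, "bob": 0.6, "cy": 0.2, "dee": 0.8}
# ASSUMPTION: a fixed skill table stands in for the learned siamese comparator.
skill = {"ann": 0.9, "bob": 0.5, "cy": 0.1, "dee": 0.7}

def prefer(a, b):
    return a if skill[a] >= skill[b] else b

ranking = rank_by_tournament(list(predictions), prefer)
top2 = aggregate_top_k(ranking, predictions, k=2)
mean_all = sum(predictions.values()) / len(predictions)

# Compare the two aggregates, assuming the event did occur (outcome = 1).
print(ranking)
print(brier_score(top2, 1), brier_score(mean_all, 1))
```

A lower Brier score is better, so in this toy setup the top-2 aggregate beats the plain mean only because the stand-in comparator happens to favor the forecasters who were right.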

Fully Convolutional Network for Automatic Road Extraction from Satellite Imagery

Jun 19, 2018
Alexander V. Buslaev, Selim S. Seferbekov, Vladimir I. Iglovikov, Alexey A. Shvets

Analysis of high-resolution satellite images has been an important research topic for traffic management, city planning, and road monitoring. One of the problems here is automatic and precise road extraction. Extracting roads from an original image is difficult and computationally expensive because of the presence of other road-like features with straight edges. In this paper, we propose an approach for automatic road extraction based on a fully convolutional neural network of the U-Net family. This network consists of a ResNet-34 pre-trained on ImageNet and a decoder adapted from vanilla U-Net. Based on validation results, the leaderboard and our own experience, this network shows superior results for the DEEPGLOBE - CVPR 2018 road extraction sub-challenge. Moreover, the network uses a moderate amount of memory, so the whole training can be performed on a single GTX 1080 or 1080 Ti video card, and it makes fairly fast predictions.

* arXiv admin note: substantial text overlap with arXiv:1806.03510, arXiv:1804.08024, arXiv:1801.05746, arXiv:1803.01207 

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

Dec 18, 2017
Santiago Pascual, Maruchan Park, Joan Serrà, Antonio Bonafonte, Kang-Hun Ahn

Speech enhancement deep learning systems usually require large amounts of training data to operate in broad conditions or real applications. This makes the adaptability of those systems to new, low-resource environments an important topic. In this work, we present the results of adapting a speech enhancement generative adversarial network by finetuning the generator with small amounts of data. We investigate the minimum requirements to obtain stable behavior in terms of several objective metrics in two very different languages: Catalan and Korean. We also study the variability of test performance on unseen noise as a function of the number of different noise types available for training. Results show that adapting a pre-trained English model with 10 min of data already achieves performance comparable to having two orders of magnitude more data. They also demonstrate the relative stability of test performance with respect to the number of training noise types.

Visual and Textual Sentiment Analysis Using Deep Fusion Convolutional Neural Networks

Nov 21, 2017
Xingyue Chen, Yunhong Wang, Qingjie Liu

Sentiment analysis is attracting more and more attention and has become a very active research topic due to its potential applications in personalized recommendation, opinion mining, etc. Most existing methods are based on either textual or visual data and cannot achieve satisfactory results, as it is very hard to extract sufficient information from a single modality. Inspired by the observation that there is a strong semantic correlation between visual and textual data in social media, we propose an end-to-end deep fusion convolutional neural network to jointly learn textual and visual sentiment representations from training examples. Information from the two modalities is fused in a pooling layer and fed into fully-connected layers to predict the sentiment polarity. We evaluate the proposed approach on two widely used data sets. Results show that our method achieves promising results compared with state-of-the-art methods, which clearly demonstrates its competency.

* Accepted as oral presentation by ICIP2017 

Joint Named Entity Recognition and Stance Detection in Tweets

Jul 30, 2017
Dilek Küçük

Named entity recognition (NER) is a well-established information extraction task which has been studied for decades. More recently, studies reporting NER experiments on social media texts have emerged. On the other hand, stance detection is a relatively new research topic usually considered within the scope of sentiment analysis. Stance detection studies are mostly applied to texts of online debates, where the stance of the text owner towards a particular target, either explicitly or implicitly mentioned in the text, is explored. In this study, we investigate the possible contribution of named entities to the stance detection task in tweets. We report the evaluation results of NER experiments as well as those of the subsequent stance detection experiments using named entities, on a publicly available stance-annotated data set of tweets. Our results indicate that named entities obtained with a high-performance NER system can contribute to stance detection performance on tweets.

* 5 pages 

SSP: Semantic Space Projection for Knowledge Graph Embedding with Text Descriptions

Jun 17, 2017
Han Xiao, Minlie Huang, Xiaoyan Zhu

Knowledge representation is an important topic with a long history in AI, and there has been a large amount of work on knowledge graph embedding, which projects symbolic entities and relations into a low-dimensional, real-valued vector space. However, most embedding methods merely concentrate on data fitting and ignore explicit semantic expression, leading to uninterpretable representations. Thus, traditional embedding methods have limited potential for many applications such as question answering and entity classification. To this end, this paper proposes a semantic representation method for knowledge graphs (KSR), which imposes a two-level hierarchical generative process that globally extracts many aspects and then locally assigns a specific category in each aspect for every triple. Since both aspects and categories are semantics-relevant, the collection of categories in each aspect is treated as the semantic representation of this triple. Extensive experiments show that our model substantially outperforms other state-of-the-art baselines.

* Submitted to AAAI.2017 

Salient Object Detection with Semantic Priors

May 23, 2017
Tam V. Nguyen, Luoqi Liu

Salient object detection has increasingly become a popular topic in cognitive and computational sciences, including computer vision and artificial intelligence research. In this paper, we propose integrating semantic priors into the salient object detection process. Our algorithm consists of three basic steps. Firstly, the explicit saliency map is obtained based on the semantic segmentation refined by the explicit saliency priors learned from the data. Next, the implicit saliency map is computed based on a trained model which maps the implicit saliency priors embedded into regional features with the saliency values. Finally, the explicit semantic map and the implicit map are adaptively fused to form a pixel-accurate saliency map which uniformly covers the objects of interest. We further evaluate the proposed framework on two challenging datasets, namely, ECSSD and HKUIS. The extensive experimental results demonstrate that our method outperforms other state-of-the-art methods.

* accepted to IJCAI 2017 

Online Nonnegative Matrix Factorization with General Divergences

Aug 02, 2016
Renbo Zhao, Vincent Y. F. Tan, Huan Xu

We develop a unified and systematic framework for performing online nonnegative matrix factorization under a wide variety of important divergences. The online nature of our algorithm makes it particularly amenable to large-scale data. We prove that the sequence of learned dictionaries converges almost surely to the set of critical points of the expected loss function. We do so by leveraging the theory of stochastic approximations and projected dynamical systems. This result substantially generalizes the previous results obtained only for the squared-$\ell_2$ loss. Moreover, the novel techniques involved in our analysis open new avenues for analyzing similar matrix factorization problems. The computational efficiency and the quality of the learned dictionary of our algorithm are verified empirically on both synthetic and real datasets. In particular, on the tasks of topic learning, shadow removal and image denoising, our algorithm achieves superior trade-offs between the quality of learned dictionary and running time over the batch and other online NMF algorithms.
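
To make the online setting concrete, here is a minimal sketch of online NMF for the simplest case only, the squared-$\ell_2$ loss that the abstract says earlier results were limited to. It is not the paper's algorithm and does not cover general divergences. Each incoming sample is encoded with Lee-Seung multiplicative updates, and the dictionary then takes one projected stochastic gradient step. The normalized step size, the function names, and the synthetic data are assumptions chosen for illustration.

```python
import random

def encode(v, W, iters=50, eps=1e-12):
    """Nonnegative coefficients h with W h ~ v, via multiplicative updates
    for the squared loss: h <- h * (W^T v) / (W^T W h)."""
    F, K = len(W), len(W[0])
    h = [1.0] * K
    for _ in range(iters):
        recon = [sum(W[f][k] * h[k] for k in range(K)) for f in range(F)]
        for k in range(K):
            num = sum(W[f][k] * v[f] for f in range(F))
            den = sum(W[f][k] * recon[f] for f in range(F)) + eps
            h[k] *= num / den
    return h

def online_nmf(samples, rank, step=0.5, seed=0):
    """Process samples one at a time and return a nonnegative dictionary W."""
    rng = random.Random(seed)
    F = len(samples[0])
    W = [[rng.random() for _ in range(rank)] for _ in range(F)]
    for v in samples:
        h = encode(v, W)
        recon = [sum(W[f][k] * h[k] for k in range(rank)) for f in range(F)]
        # ASSUMPTION: step normalized by 1 + ||h||^2 to keep each update stable.
        eta = step / (1.0 + sum(x * x for x in h))
        for f in range(F):
            err = recon[f] - v[f]
            for k in range(rank):
                # Gradient step on 0.5 * ||W h - v||^2, then projection onto W >= 0.
                W[f][k] = max(W[f][k] - eta * err * h[k], 0.0)
    return W

# Hypothetical synthetic stream: nonnegative mixtures of two nonnegative atoms.
rng = random.Random(1)
true_W = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]]
samples = []
for _ in range(200):
    coeffs = [rng.random(), rng.random()]
    samples.append([sum(true_W[f][k] * coeffs[k] for k in range(2)) for f in range(4)])

W = online_nmf(samples, rank=2)
print(W)
```

Because each sample is seen once and then discarded, memory use does not grow with the length of the stream, which is what makes the online formulation attractive for large-scale data.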

Modeling meaning: computational interpreting and understanding of natural language fragments

Jun 29, 2016
Michael Kapustin, Pavlo Kapustin

In this introductory article we present the basics of an approach to implementing computational interpreting of natural language aiming to model the meanings of words and phrases. Unlike other approaches, we attempt to define the meanings of text fragments in a composable and computer interpretable way. We discuss models and ideas for detecting different types of semantic incomprehension and choosing the interpretation that makes most sense in a given context. Knowledge representation is designed for handling context-sensitive and uncertain / imprecise knowledge, and for easy accommodation of new information. It stores quantitative information capturing the essence of the concepts, because it is crucial for working with natural language understanding and reasoning. Still, the representation is general enough to allow for new knowledge to be learned, and even generated by the system. The article concludes by discussing some reasoning-related topics: possible approaches to generation of new abstract concepts, and describing situations and concepts in words (e.g. for specifying interpretation difficulties).

* 26 pages 
