"Topic": models, code, and papers

Towards Hardware Implementation of Neural Network-based Communication Algorithms

Feb 19, 2019
Fayçal Ait Aoudia, Jakob Hoydis

There is recent interest in neural network (NN)-based communication algorithms, which have been shown to achieve (beyond) state-of-the-art performance for a variety of problems or to reduce implementation complexity. However, most work on this topic is simulation-based, and implementation on specialized hardware for fast inference, such as field-programmable gate arrays (FPGAs), is widely ignored. In particular, for practical uses, NN weights should be quantized and inference carried out in fixed-point arithmetic instead of the floating-point arithmetic widely used in consumer-class computers and graphics processing units (GPUs). Moving to such representations enables higher inference rates and complexity reductions, at the cost of precision loss. We demonstrate that it is possible to implement NN-based algorithms in fixed-point arithmetic with quantized weights at negligible performance loss and with hardware complexity compatible with practical systems, such as FPGAs and application-specific integrated circuits (ASICs).
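As a rough illustration of the fixed-point idea described in this abstract, the sketch below quantizes real-valued weights to signed integers and computes a dot product using integer arithmetic only. It is not the authors' implementation: the function names, the 16-bit word length, and the 8 fractional bits are assumptions chosen for the example.

```python
def quantize_fixed_point(x, frac_bits=8, total_bits=16):
    """Map a real number to a signed fixed-point integer.

    frac_bits and total_bits are illustrative assumptions, not values
    taken from the paper.
    """
    scale = 1 << frac_bits                    # 2**frac_bits steps per unit
    q_min = -(1 << (total_bits - 1))          # most negative representable code
    q_max = (1 << (total_bits - 1)) - 1       # most positive representable code
    q = int(round(x * scale))                 # round to the nearest grid point
    return max(q_min, min(q_max, q))          # saturate instead of wrapping


def dequantize(q, frac_bits=8):
    """Convert a fixed-point integer back to a float, for inspection only."""
    return q / (1 << frac_bits)


def fixed_point_dot(weights_q, inputs_q, frac_bits=8):
    """Dot product on fixed-point integers.

    Each product carries 2 * frac_bits fractional bits, so the accumulated
    sum is shifted right by frac_bits to return to the input format.
    The shift rounds toward negative infinity.
    """
    acc = sum(w * x for w, x in zip(weights_q, inputs_q))
    return acc >> frac_bits


weights = [quantize_fixed_point(w) for w in (0.5, -0.25)]
inputs = [quantize_fixed_point(x) for x in (1.0, 1.0)]
result = dequantize(fixed_point_dot(weights, inputs))
```

Values outside the representable range saturate rather than wrap around, which is one common design choice; the precision loss mentioned in the abstract comes from the rounding step and from the final shift.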

Cross-Modal and Hierarchical Modeling of Video and Text

Oct 16, 2018
Bowen Zhang, Hexiang Hu, Fei Sha

Visual data and text data are composed of information at multiple granularities. A video can describe a complex scene that is composed of multiple clips or shots, where each depicts a semantically coherent event or action. Similarly, a paragraph may contain sentences with different topics, which collectively convey a coherent message or story. In this paper, we investigate modeling techniques for such hierarchical sequential data where there are correspondences across multiple modalities. Specifically, we introduce hierarchical sequence embedding (HSE), a generic model for embedding sequential data of different modalities into hierarchically semantic spaces, with either explicit or implicit correspondence information. We perform empirical studies on large-scale video and paragraph retrieval datasets and demonstrate superior performance by the proposed methods. Furthermore, we examine the effectiveness of our learned embeddings when applied to downstream tasks. We show their utility in zero-shot action recognition and video captioning.

* Accepted by ECCV 2018 

ADVISE: Symbolism and External Knowledge for Decoding Advertisements

Jul 29, 2018
Keren Ye, Adriana Kovashka

In order to convey the most content in their limited space, advertisements embed references to outside knowledge via symbolism. For example, a motorcycle stands for adventure (a positive property the ad wants associated with the product being sold), and a gun stands for danger (a negative property to dissuade viewers from undesirable behaviors). We show how to use symbolic references to better understand the meaning of an ad. We further show how anchoring ad understanding in general-purpose object recognition and image captioning improves results. We formulate the ad understanding task as matching the ad image to human-generated statements that describe the action that the ad prompts, and the rationale it provides for taking this action. Our proposed method outperforms the state of the art on this task, and on an alternative formulation of question-answering on ads. We show additional applications of our learned representations for matching ads to slogans, and clustering ads according to their topic, without extra training.

* To appear, Proceedings of the European Conference on Computer Vision (ECCV) 

A Portuguese Native Language Identification Dataset

Apr 30, 2018
Iria del Río, Marcos Zampieri, Shervin Malmasi

In this paper we present NLI-PT, the first Portuguese dataset compiled for Native Language Identification (NLI), the task of identifying an author's first language based on their second language writing. The dataset includes 1,868 student essays written by learners of European Portuguese, native speakers of the following L1s: Chinese, English, Spanish, German, Russian, French, Japanese, Italian, Dutch, Tetum, Arabic, Polish, Korean, Romanian, and Swedish. NLI-PT includes the original student text and four different types of annotation: POS, fine-grained POS, constituency parses, and dependency parses. NLI-PT can be used not only in NLI but also in research on several topics in the field of Second Language Acquisition and educational NLP. We discuss possible applications of this dataset and present the results obtained for the first lexical baseline system for Portuguese NLI.

* Proceedings of The 13th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) 

Linking Tweets with Monolingual and Cross-Lingual News using Transformed Word Embeddings

Oct 25, 2017
Aditya Mogadala, Dominik Jung, Achim Rettinger

Social media platforms have grown into an important medium for spreading information about events published by traditional media, such as news articles. Grouping such diverse sources of information that discuss the same topic from varied perspectives provides new insights. But the gap in word usage between informal social media content such as tweets and diligently written content (e.g. news articles) makes such grouping difficult. In this paper, we propose a transformation framework to bridge the word usage gap between tweets and online news articles across languages by leveraging their word embeddings. Using our framework, word embeddings extracted from tweets and news articles are aligned closer to each other across languages, thus facilitating the identification of similarity between news articles and tweets. Experimental results show a notable improvement over baselines for monolingual comparison of tweets and news articles, while new findings are reported for cross-lingual comparison.

* Presented at CICLing 2017 (18th International Conference on Intelligent Text Processing and Computational Linguistics). To appear in International Journal of Computational Linguistics and Applications (IJCLA) 

Deeply Aggregated Alternating Minimization for Image Restoration

Dec 20, 2016
Youngjung Kim, Hyungjoo Jung, Dongbo Min, Kwanghoon Sohn

Regularization-based image restoration has remained an active research topic in computer vision and image processing. It often leverages a guidance signal captured in different fields as an additional cue. In this work, we present a general framework for image restoration, called deeply aggregated alternating minimization (DeepAM). We propose to train a deep neural network to advance two of the steps in the conventional AM algorithm: proximal mapping and β-continuation. Both steps are learned from a large dataset in an end-to-end manner. The proposed framework enables convolutional neural networks (CNNs) to operate as a prior or regularizer in the AM algorithm. We show that our regularizer, learned via deep aggregation, outperforms recent data-driven approaches as well as nonlocal-based methods. The flexibility and effectiveness of our framework are demonstrated in several image restoration tasks, including single image denoising, RGB-NIR restoration, and depth super-resolution.

* 9 pages 

Neural Contextual Conversation Learning with Labeled Question-Answering Pairs

Jul 20, 2016
Kun Xiong, Anqi Cui, Zefeng Zhang, Ming Li

Neural conversational models tend to produce generic or safe responses in different contexts, e.g., replying "Of course" to narrative statements or "I don't know" to questions. In this paper, we propose an end-to-end approach to avoid such problems in neural generative models. Additional memory mechanisms have been introduced to standard sequence-to-sequence (seq2seq) models, so that context can be considered while generating sentences. Three seq2seq models, which memorize a fixed-size contextual vector from hidden input, hidden input/output, and a gated contextual attention structure respectively, have been trained and tested on a dataset of labeled question-answering pairs in Chinese. The model with contextual attention outperforms the others, including state-of-the-art seq2seq models, on the perplexity test. The novel contextual model generates diverse and robust responses, and is able to carry out conversations on a wide range of topics appropriately.

Spatio-Temporal Image Boundary Extrapolation

May 24, 2016
Apratim Bhattacharyya, Mateusz Malinowski, Mario Fritz

Boundary prediction in images as well as video has been a very active topic of research, and organizing visual information into boundaries and segments is believed to be a cornerstone of visual perception. While prior work has focused on predicting boundaries for observed frames, our work aims at predicting boundaries of future unobserved frames. This requires our model to learn about the fate of boundaries and extrapolate motion patterns. We experiment on an established real-world video segmentation dataset, which provides a testbed for this new task. We show for the first time spatio-temporal boundary extrapolation in this challenging scenario. Furthermore, we show long-term prediction of boundaries in situations where the motion is governed by the laws of physics. We successfully predict boundaries in a billiard scenario without any assumptions of a strong parametric model or any object notion. We argue that our model has, with minimal model assumptions, derived a notion of 'intuitive physics' that can be applied to novel scenes.

A Novel Approach Towards Clustering Based Image Segmentation

Jun 04, 2015
Dibya Jyoti Bora, Anil Kumar Gupta

In computer vision, image segmentation is a major research topic. Because of its vital role in image processing, there is always a need for better image segmentation methods. Clustering is an unsupervised technique with applications in almost every field of science and engineering. Many researchers have used clustering for image segmentation, but such approaches still need improvement. In this paper, a novel approach for clustering-based image segmentation is proposed. We emphasize the choice of color space and choose the Lab color space for this task. The well-known hard clustering algorithm K-means is used; because its performance depends on choosing a proper distance measure, we use the cosine distance. The segmented image is then filtered with a Sobel filter, and the filtered image is analyzed with the marker-based watershed algorithm to obtain the final segmentation of the original image. The MSE and PSNR values are evaluated to assess performance.
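The two evaluation measures named at the end of this abstract, MSE and PSNR, can be sketched as follows. This is a minimal illustration rather than the authors' code: images are assumed to be flat lists of pixel intensities of equal length, and the peak value of 255 assumes 8-bit data.

```python
import math


def mse(original, processed):
    """Mean squared error between two equally sized pixel sequences."""
    if len(original) != len(processed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, processed)) / len(original)


def psnr(original, processed, max_value=255.0):
    """Peak signal-to-noise ratio in decibels.

    max_value is the largest possible pixel value (255 for 8-bit images,
    an assumption made for this example). Identical inputs give infinity.
    """
    err = mse(original, processed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


reference = [10, 20, 30, 40]
segmented = [12, 18, 33, 40]
error = mse(reference, segmented)      # (4 + 4 + 9 + 0) / 4
quality_db = psnr(reference, segmented)
```

A lower MSE and a higher PSNR both indicate that the processed image is closer to the reference.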

* 5 pages, 7 figures, 1 table in International Journal of Emerging Science and Engineering, Volume-2 Issue-11, September 2014. arXiv admin note: text overlap with arXiv:1506.01472 

Analyzing the Language of Food on Social Media

Sep 11, 2014
Daniel Fried, Mihai Surdeanu, Stephen Kobourov, Melanie Hingle, Dane Bell

We investigate the predictive power behind the language of food on social media. We collect a corpus of over three million food-related posts from Twitter and demonstrate that many latent population characteristics can be directly predicted from this data: overweight rate, diabetes rate, political leaning, and home geographical location of authors. For all tasks, our language-based models significantly outperform the majority-class baselines. Performance is further improved with more complex natural language processing, such as topic modeling. We analyze which textual features have the most predictive power for these datasets, providing insight into the connections between the language of food, geographic locale, and community characteristics. Lastly, we design and implement an online system for real-time query and visualization of the dataset. Visualization tools, such as geo-referenced heatmaps, semantics-preserving word clouds, and temporal histograms, allow us to discover more complex, global patterns mirrored in the language of food.
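For readers unfamiliar with the majority-class baseline mentioned in this abstract, the sketch below shows what such a baseline computes: it always predicts the most frequent training label, so any useful model should beat its accuracy. The function names and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def majority_class(train_labels):
    """Return the most frequent label in the training set."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    return Counter(train_labels).most_common(1)[0][0]


def baseline_accuracy(train_labels, test_labels):
    """Accuracy obtained by always predicting the majority training label."""
    if not test_labels:
        raise ValueError("test_labels must not be empty")
    guess = majority_class(train_labels)
    correct = sum(1 for label in test_labels if label == guess)
    return correct / len(test_labels)


# Toy labels, invented for illustration only.
train = ["high", "high", "low", "high", "low"]
test = ["high", "low", "low", "high"]
accuracy = baseline_accuracy(train, test)
```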

* An extended abstract of this paper will appear in IEEE Big Data 2014 
