Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

Structure preserving deep learning

Jun 05, 2020
Elena Celledoni, Matthias J. Ehrhardt, Christian Etmann, Robert I McLachlan, Brynjulf Owren, Carola-Bibiane Schönlieb, Ferdia Sherry

Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the tradeoff between computational effort, amount of data and model complexity is required to successfully design a deep learning approach for a given problem. A large amount of progress made in deep learning has been based on heuristic explorations, but there is a growing effort to mathematically understand the structure in existing deep learning methods and to systematically design new deep learning methods to preserve certain types of structure in deep learning. In this article, we review a number of these directions: some deep neural networks can be understood as discretisations of dynamical systems, neural networks can be designed to have desirable properties such as invertibility or group equivariance, and new algorithmic frameworks based on conformal Hamiltonian systems and Riemannian manifolds to solve the optimisation problems have been proposed. We conclude our review of each of these topics by discussing some open problems that we consider to be interesting directions for future research.

  Access Paper or Ask Questions

COVID-19 Public Opinion and Emotion Monitoring System Based on Time Series Thermal New Word Mining

May 23, 2020
Yixian Zhang, Jieren Chen, Boyi Liu, Yifan Yang, Haocheng Li, Xinyi Zheng, Xi Chen, Tenglong Ren, Naixue Xiong

With the spread and development of new epidemics, it is of great reference value to identify the changing trends of epidemics in public emotions. We designed and implemented the COVID-19 public opinion monitoring system based on time series thermal new word mining. A new word structure discovery scheme based on the timing explosion of network topics and a Chinese sentiment analysis method for the COVID-19 public opinion environment is proposed. Establish a "Scrapy-Redis-Bloomfilter" distributed crawler framework to collect data. The system can judge the positive and negative emotions of the reviewer based on the comments, and can also reflect the depth of the seven emotions such as Hopeful, Happy, and Depressed. Finally, we improved the sentiment discriminant model of this system and compared the sentiment discriminant error of COVID-19 related comments with the Jiagu deep learning model. The results show that our model has better generalization ability and smaller discriminant error. We designed a large data visualization screen, which can clearly show the trend of public emotions, the proportion of various emotion categories, keywords, hot topics, etc., and fully and intuitively reflect the development of public opinion.

  Access Paper or Ask Questions

Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-based Detection

Jul 22, 2019
David Ifeoluwa Adelani, Haotian Mai, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen

Advanced neural language models (NLMs) are widely used in sequence generation tasks because they are able to produce fluent and meaningful sentences. They can also be used to generate fake reviews, which can then be used to attack online review systems and influence the buying decisions of online shoppers. A problem in fake review generation is how to generate the desired sentiment/topic. Existing solutions first generate an initial review based on some keywords and then modify some of the words in the initial review so that the review has the desired sentiment/topic. We overcome this problem by using the GPT-2 NLM to generate a large number of high-quality reviews based on a review with the desired sentiment and then using a BERT based text classifier (with accuracy of 96\%) to filter out reviews with undesired sentiments. Because none of the words in the review are modified, fluent samples like the training data can be generated from the learned distribution. A subjective evaluation with 80 participants demonstrated that this simple method can produce reviews that are as fluent as those written by people. It also showed that the participants tended to distinguish fake reviews randomly. Two countermeasures, GROVER and GLTR, were found to be able to accurately detect fake review.

* Submitted to the IEEE International Workshop on Information Forensics and Security (WIFS) 

  Access Paper or Ask Questions

Synthetically Trained Icon Proposals for Parsing and Summarizing Infographics

Jul 27, 2018
Spandan Madan, Zoya Bylinskii, Matthew Tancik, Adrià Recasens, Kimberli Zhong, Sami Alsheikh, Hanspeter Pfister, Aude Oliva, Fredo Durand

Widely used in news, business, and educational media, infographics are handcrafted to effectively communicate messages about complex and often abstract topics including `ways to conserve the environment' and `understanding the financial crisis'. Composed of stylistically and semantically diverse visual and textual elements, infographics pose new challenges for computer vision. While automatic text extraction works well on infographics, computer vision approaches trained on natural images fail to identify the stand-alone visual elements in infographics, or `icons'. To bridge this representation gap, we propose a synthetic data generation strategy: we augment background patches in infographics from our Visually29K dataset with Internet-scraped icons which we use as training data for an icon proposal mechanism. On a test set of 1K annotated infographics, icons are located with 38% precision and 34% recall (the best model trained with natural images achieves 14% precision and 7% recall). Combining our icon proposals with icon classification and text extraction, we present a multi-modal summarization application. Our application takes an infographic as input and automatically produces text tags and visual hashtags that are textually and visually representative of the infographic's topics respectively.

  Access Paper or Ask Questions

Real-World Single Image Super-Resolution: A Brief Review

Mar 03, 2021
Honggang Chen, Xiaohai He, Linbo Qing, Yuanyuan Wu, Chao Ren, Ce Zhu

Single image super-resolution (SISR), which aims to reconstruct a high-resolution (HR) image from a low-resolution (LR) observation, has been an active research topic in the area of image processing in recent decades. Particularly, deep learning-based super-resolution (SR) approaches have drawn much attention and have greatly improved the reconstruction performance on synthetic data. Recent studies show that simulation results on synthetic data usually overestimate the capacity to super-resolve real-world images. In this context, more and more researchers devote themselves to develop SR approaches for realistic images. This article aims to make a comprehensive review on real-world single image super-resolution (RSISR). More specifically, this review covers the critical publically available datasets and assessment metrics for RSISR, and four major categories of RSISR methods, namely the degradation modeling-based RSISR, image pairs-based RSISR, domain translation-based RSISR, and self-learning-based RSISR. Comparisons are also made among representative RSISR methods on benchmark datasets, in terms of both reconstruction quality and computational efficiency. Besides, we discuss challenges and promising research topics on RSISR.

* 18 pages, 12 figure, 4 tables 

  Access Paper or Ask Questions

A Survey on Theoretical Advances of Community Detection in Networks

Aug 26, 2018
Yunpeng Zhao

Real-world networks usually have community structure, that is, nodes are grouped into densely connected communities. Community detection is one of the most popular and best-studied research topics in network science and has attracted attention in many different fields, including computer science, statistics, social sciences, among others. Numerous approaches for community detection have been proposed in literature, from ad-hoc algorithms to systematic model-based approaches. The large number of available methods leads to a fundamental question: whether a certain method can provide consistent estimates of community labels. The stochastic blockmodel (SBM) and its variants provide a convenient framework for the study of such problems. This article is a survey on the recent theoretical advances of community detection. The authors review a number of community detection methods and their theoretical properties, including graph cut methods, profile likelihoods, the pseudo-likelihood method, the variational method, belief propagation, spectral clustering, and semidefinite relaxations of the SBM. The authors also briefly discuss other research topics in community detection such as robust community detection, community detection with nodal covariates and model selection, as well as suggest a few possible directions for future research.

* Wire Computational Statistics, 2017 

  Access Paper or Ask Questions

Prosody: The Rhythms and Melodies of Speech

Apr 27, 2017
Dafydd Gibbon

The present contribution is a tutorial on selected aspects of prosody, the rhythms and melodies of speech, based on a course of the same name at the Summer School on Contemporary Phonetics and Phonology at Tongji University, Shanghai, China in July 2016. The tutorial is not intended as an introduction to experimental methodology or as an overview of the literature on the topic, but as an outline of observationally accessible aspects of fundamental frequency and timing patterns with the aid of computational visualisation, situated in a semiotic framework of sign ranks and interpretations. After an informal introduction to the basic concepts of prosody in the introduction and a discussion of the place of prosody in the architecture of language, a selection of acoustic phonetic topics in phonemic tone and accent prosody, word prosody, phrasal prosody and discourse prosody are discussed, and a stylisation method for visualising aspects of prosody is introduced. Examples are taken from a number of typologically different languages: Anyi/Agni (Niger-Congo>Kwa, Ivory Coast), English, Kuki-Thadou (Sino-Tibetan, North-East India and Myanmar), Mandarin Chinese, Tem (Niger-Congo>Gur, Togo) and Farsi. The main focus is on fundamental frequency patterns, but issues of timing and rhythm are also discussed. In the final section, further reading and possible future research directions are outlined.

* 35 pages, 22 figures (2nd version at 

  Access Paper or Ask Questions

NetSpam: a Network-based Spam Detection Framework for Reviews in Online Social Media

Mar 10, 2017
Saeedreza Shehnepoor, Mostafa Salehi, Reza Farahbakhsh, Noel Crespi

Nowadays, a big part of people rely on available content in social media in their decisions (e.g. reviews and feedback on a topic or product). The possibility that anybody can leave a review provide a golden opportunity for spammers to write spam reviews about products and services for different interests. Identifying these spammers and the spam content is a hot topic of research and although a considerable number of studies have been done recently toward this end, but so far the methodologies put forth still barely detect spam reviews, and none of them show the importance of each extracted feature type. In this study, we propose a novel framework, named NetSpam, which utilizes spam features for modeling review datasets as heterogeneous information networks to map spam detection procedure into a classification problem in such networks. Using the importance of spam features help us to obtain better results in terms of different metrics experimented on real-world review datasets from Yelp and Amazon websites. The results show that NetSpam outperforms the existing methods and among four categories of features; including review-behavioral, user-behavioral, reviewlinguistic, user-linguistic, the first type of features performs better than the other categories.

  Access Paper or Ask Questions

Hotel Preference Rank based on Online Customer Review

Oct 10, 2021
Muhammad Apriandito Arya Saputra, Andry Alamsyah, Fajar Ibnu Fatihan

Topline hotels are now shifting into the digital way in how they understand their customers to maintain and ensuring satisfaction. Rather than the conventional way which uses written reviews or interviews, the hotel is now heavily investing in Artificial Intelligence particularly Machine Learning solutions. Analysis of online customer reviews changes the way companies make decisions in a more effective way than using conventional analysis. The purpose of this research is to measure hotel service quality. The proposed approach emphasizes service quality dimensions reviews of the top-5 luxury hotel in Indonesia that appear on the online travel site TripAdvisor based on section Best of 2018. In this research, we use a model based on a simple Bayesian classifier to classify each customer review into one of the service quality dimensions. Our model was able to separate each classification properly by accuracy, kappa, recall, precision, and F-measure measurements. To uncover latent topics in the customer's opinion we use Topic Modeling. We found that the common issue that occurs is about responsiveness as it got the lowest percentage compared to others. Our research provides a faster outlook of hotel rank based on service quality to end customers based on a summary of the previous online review.

* Test Engineering and Management, Vol. 83: March/April 2020 
* 5 pages, 6 figures, 5 tables 

  Access Paper or Ask Questions

Chronological Citation Recommendation with Time Preference

Jan 19, 2021
Shutian Ma, Heng Zhang, Chengzhi Zhang, Xiaozhong Liu

Citation recommendation is an important task to assist scholars in finding candidate literature to cite. Traditional studies focus on static models of recommending citations, which do not explicitly distinguish differences between papers that are caused by temporal variations. Although, some researchers have investigated chronological citation recommendation by adding time related function or modeling textual topics dynamically. These solutions can hardly cope with function generalization or cold-start problems when there is no information for user profiling or there are isolated papers never being cited. With the rise and fall of science paradigms, scientific topics tend to change and evolve over time. People would have the time preference when citing papers, since most of the theoretical basis exist in classical readings that published in old time, while new techniques are proposed in more recent papers. To explore chronological citation recommendation, this paper wants to predict the time preference based on user queries, which is a probability distribution of citing papers published in different time slices. Then, we use this time preference to re-rank the initial citation list obtained by content-based filtering. Experimental results demonstrate that task performance can be further enhanced by time preference and it's flexible to be added in other citation recommendation frameworks.

  Access Paper or Ask Questions