"Topic": models, code, and papers

A survey of top-down approaches for human pose estimation

Feb 05, 2022
Thong Duy Nguyen, Milan Kresovic

Human pose estimation in two-dimensional images and videos has recently been a hot topic in computer vision due to its benefits and potential applications for improving human life, such as behavior recognition, motion capture, augmented reality, robot training, and movement tracking. Many state-of-the-art methods implemented with deep learning have addressed several challenges and brought remarkable results in the field of human pose estimation. Approaches are classified into two kinds: the two-step framework (top-down approach) and the part-based framework (bottom-up approach). The two-step framework first runs a person detector and then estimates the pose within each box independently, whereas the part-based framework detects all body parts in the image and then associates the parts belonging to distinct persons. This paper aims to provide newcomers with an extensive review of deep learning-based methods for recognizing human pose in 2D images, focusing only on top-down approaches since 2016. The discussion covers significant detectors and estimators and their mathematical background, the challenges and limitations, benchmark datasets, evaluation metrics, and comparisons between methods.

* 16 pages 
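
The two-step (top-down) framework described in the abstract above can be sketched in a few lines. This is a minimal illustration, not code from the survey: `top_down_pose`, `toy_detector`, and `toy_estimator` are hypothetical names, and the detector and estimator are toy stand-ins for real models. It only shows the control flow: detect person boxes, estimate keypoints inside each crop independently, then map the keypoints back to image coordinates.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1) in image pixels
Keypoint = Tuple[float, float]    # (x, y)
Image = List[List[int]]           # toy grayscale image as nested lists


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def top_down_pose(
    image: Image,
    detect_people: Callable[[Image], List[Box]],
    estimate_pose: Callable[[Image], List[Keypoint]],
) -> List[List[Keypoint]]:
    """Step 1: detect person boxes. Step 2: estimate a pose per box."""
    poses = []
    for box in detect_people(image):
        x0, y0, _, _ = box
        local = estimate_pose(crop(image, box))  # keypoints in crop coordinates
        # Shift keypoints from crop coordinates back to image coordinates.
        poses.append([(x + x0, y + y0) for x, y in local])
    return poses


# Hypothetical stand-ins for a real person detector and pose estimator.
def toy_detector(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 1, 4, 3)]


def toy_estimator(patch: Image) -> List[Keypoint]:
    # Returns a single "keypoint" at the centre of the crop.
    height, width = len(patch), len(patch[0])
    return [(width / 2, height / 2)]


image = [[0] * 4 for _ in range(3)]
poses = top_down_pose(image, toy_detector, toy_estimator)
print(poses)
```

Because each box is processed independently, the cost of the second step grows with the number of detected people, which is one of the trade-offs usually cited against bottom-up methods.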


RGB-D SLAM Using Attention Guided Frame Association

Jan 28, 2022
Ali Caglayan, Nevrez Imamoglu, Oguzhan Guclu, Ali Osman Serhatoglu, Weimin Wang, Ahmet Burak Can, Ryosuke Nakamura

Deep learning models, as an emerging topic, have shown great progress in various fields. In particular, visualization tools such as class activation mapping methods have provided visual explanations of the reasoning of convolutional neural networks (CNNs). By using the gradients of the network layers, it is possible to demonstrate where the networks pay attention during a specific image recognition task. Moreover, these gradients can be integrated with CNN features for localizing more generalized task-dependent attentive (salient) objects in scenes. Despite this progress, there is not much explicit usage of this gradient (network attention) information to integrate with CNN representations for object semantics. This can be very useful for visual tasks such as simultaneous localization and mapping (SLAM), where CNN representations of spatially attentive object locations may lead to improved performance. Therefore, in this work, we propose the use of task-specific network attention for RGB-D indoor SLAM. To do so, we integrate layer-wise object attention information (layer gradients) with CNN layer representations to improve frame association performance in a state-of-the-art RGB-D indoor SLAM method. Experiments show promising initial results with improved performance.

* 5 pages, 3 figures, 1 table 


TPAD: Identifying Effective Trajectory Predictions Under the Guidance of Trajectory Anomaly Detection Model

Jan 09, 2022
Chunnan Wang, Chen Liang, Xiang Chen, Hongzhi Wang

Trajectory Prediction (TP) is an important research topic in the computer vision and robotics fields. Recently, many stochastic TP models have been proposed to deal with this problem and have achieved better performance than traditional models with deterministic trajectory outputs. However, these stochastic models can generate a number of future trajectories of varying quality. They lack self-evaluation ability, that is, the ability to examine the rationality of their prediction results, and thus fail to guide users in identifying high-quality results among the candidates. This hinders them from performing at their best in real applications. In this paper, we address this defect and propose TPAD, a novel TP evaluation method based on the trajectory Anomaly Detection (AD) technique. In TPAD, we first combine the Automated Machine Learning (AutoML) technique with experience from the AD and TP fields to automatically design an effective trajectory AD model. Then, we utilize the learned trajectory AD model to examine the rationality of the predicted trajectories and screen out good TP results for users. Extensive experimental results demonstrate that TPAD can effectively identify near-optimal prediction results, improving the practical effectiveness of stochastic TP models.

* 14 pages, 7 figures 


Deep Transfer Learning & Beyond: Transformer Language Models in Information Systems Research

Oct 23, 2021
Ross Gruetzemacher, David Paradice

AI is widely thought to be poised to transform business, yet current perceptions of the scope of this transformation may be myopic. Recent progress in natural language processing involving transformer language models (TLMs) offers a potential avenue for AI-driven business and societal transformation that is beyond the scope of what most currently foresee. We review this recent progress as well as recent literature utilizing text mining in top IS journals to develop an outline for how future IS research can benefit from these new techniques. Our review of existing IS literature reveals that suboptimal text mining techniques are prevalent and that the more advanced TLMs could be applied to enhance and increase IS research involving text data, and to enable new IS research topics, thus creating more value for the research community. This is possible because these techniques make it easier to develop very powerful custom systems and their performance is superior to existing methods for a wide range of tasks and applications. Further, multilingual language models make possible higher quality text analytics for research in multiple languages. We also identify new avenues for IS research, like language user interfaces, that may offer even greater potential for future IS research.

* Under review (revised once). Section 2, the literature review on deep transfer learning and transformer language models, is a valuable introduction for a broad audience (not just information systems researchers). 33 pages plus 13-page appendix 


On Optimal Interpolation In Linear Regression

Oct 21, 2021
Eduard Oravkin, Patrick Rebeschini

Understanding when and why interpolating methods generalize well has recently been a topic of interest in statistical learning theory. However, systematically connecting interpolating methods to achievable notions of optimality has only received partial attention. In this paper, we investigate the optimal way to interpolate in linear regression using functions that are linear in the response variable (as is the case for the Bayes optimal estimator in ridge regression) and depend on the data, the population covariance of the data, the signal-to-noise ratio, and the covariance of the prior for the signal, but do not depend on the value of the signal itself or on the noise vector in the training data. We provide a closed-form expression for the interpolator that achieves this notion of optimality and show that it can be derived as the limit of preconditioned gradient descent with a specific initialization. We identify a regime where the minimum-norm interpolator provably generalizes arbitrarily worse than the optimal response-linear achievable interpolator that we introduce, and validate with numerical experiments that the notion of optimality we consider can be achieved by interpolating methods that only use the training data as input in the case of an isotropic prior. Finally, we extend the notion of optimal response-linear interpolation to random features regression under a linear data-generating model that has been previously studied in the literature.

* 25 pages, 7 figures, to appear in NeurIPS 2021 
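
For readers unfamiliar with the baselines named in the abstract above, the standard textbook forms of the minimum-norm interpolator and the ridge estimator are written out below. The notation is an assumption made here for illustration (design matrix $X \in \mathbb{R}^{n \times d}$ with $d > n$ and full row rank, responses $y \in \mathbb{R}^n$, ridge parameter $\lambda > 0$); this is not the optimal interpolator derived in the paper.

```latex
% Minimum-norm interpolator: the smallest-norm solution that fits the data exactly
\hat{w}_{\mathrm{mn}}
  = \arg\min_{w \,:\, Xw = y} \lVert w \rVert_2
  = X^\top (X X^\top)^{-1} y

% Ridge estimator, in its two equivalent forms
\hat{w}_\lambda
  = (X^\top X + \lambda I_d)^{-1} X^\top y
  = X^\top (X X^\top + \lambda I_n)^{-1} y

% Ridge recovers the minimum-norm interpolator as the penalty vanishes
\lim_{\lambda \to 0^+} \hat{w}_\lambda = \hat{w}_{\mathrm{mn}}
```

Both estimators are linear in $y$, which is the sense in which an estimator is "linear in the response variable".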


Incremental Community Detection in Distributed Dynamic Graph

Oct 12, 2021
Tariq Abughofa, Ahmed A. Harby, Haruna Isah, Farhana Zulkernine

Community detection is an important research topic in graph analytics that has a wide range of applications. A variety of static community detection algorithms and quality metrics were developed in the past few years. However, most real-world graphs are not static and often change over time. In the case of streaming data, communities in the associated graph need to be updated either continuously or whenever new data streams are added to the graph, which poses a much greater challenge in devising good community detection algorithms for maintaining dynamic graphs over streaming data. In this paper, we propose an incremental community detection algorithm for maintaining a dynamic graph over streaming data. The contributions of this study include (a) the implementation of a Distributed Weighted Community Clustering (DWCC) algorithm, (b) the design and implementation of a novel Incremental Distributed Weighted Community Clustering (IDWCC) algorithm, and (c) an experimental study to compare the performance of our IDWCC algorithm with the DWCC algorithm. We validate the functionality and efficiency of our framework in processing streaming data and performing large in-memory distributed dynamic graph analytics. The results demonstrate that our IDWCC algorithm performs up to three times faster than the DWCC algorithm for a similar accuracy.

* BigDataService 2021 best paper award 


A Comparative Study of Sentiment Analysis Using NLP and Different Machine Learning Techniques on US Airline Twitter Data

Oct 02, 2021
Md. Taufiqul Haque Khan Tusar, Md. Touhidul Islam

Today's business ecosystem has become very competitive. Customer satisfaction has become a major focus for business growth. Business organizations are spending a lot of money and human resources on various strategies to understand and fulfill their customers' needs. However, because of flawed manual analysis of customers' multifarious needs, many organizations are failing to achieve customer satisfaction. As a result, they are losing customer loyalty and spending extra money on marketing. We can solve these problems by implementing Sentiment Analysis, a combined technique of Natural Language Processing (NLP) and Machine Learning (ML). Sentiment Analysis is broadly used to extract insights from wider public opinion on certain topics, products, and services, and it can be applied to any data available online. In this paper, we have introduced two NLP techniques (Bag-of-Words and TF-IDF) and various ML classification algorithms (Support Vector Machine, Logistic Regression, Multinomial Naive Bayes, Random Forest) to find an effective approach for Sentiment Analysis on a large, imbalanced, and multi-classed dataset. Our best approaches provide 77% accuracy using Support Vector Machine and Logistic Regression with the Bag-of-Words technique.

* 4 pages, 2 figures, Presented in the Proceeding of the International Conference on Electronics, Communications and Information Technology (ICECIT), 14-16 September 2021 
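
The two text representations named in the abstract above, Bag-of-Words and TF-IDF, can be illustrated with a small standard-library sketch. It is not the authors' pipeline: the function names and the three example tweets are made up for illustration, tokenization is a plain whitespace split, and the TF-IDF variant is the unsmoothed `count * log(N / df)` form (libraries often use smoothed or normalized variants).

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count row per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tf_idf(rows):
    """Reweight counts by log(N / document frequency) for each term."""
    n_docs = len(rows)
    n_terms = len(rows[0])
    # Every vocabulary term occurs in at least one document, so df >= 1.
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / df[j]) for j in range(n_terms)]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in rows]


# Made-up example tweets, not taken from the US Airline dataset.
tweets = ["great flight great crew", "late flight", "lost my bag"]
vocab, counts = bag_of_words(tweets)
weights = tf_idf(counts)

print(vocab)
print(counts[0])
print(weights[0])
```

In the first tweet, "flight" receives a lower TF-IDF weight than "great" both because it occurs once rather than twice and because it also appears in a second tweet. Either matrix could then be passed to a classifier such as logistic regression.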


Introducing an Abusive Language Classification Framework for Telegram to Investigate the German Hater Community

Sep 15, 2021
Maximilian Wich, Adrian Gorniak, Tobias Eder, Daniel Bartmann, Burak Enes Çakici, Georg Groh

Since traditional social media platforms ban more and more actors that distribute hate speech or other forms of abusive language (deplatforming), these actors migrate to alternative platforms that do not moderate users' content. One known platform that is relevant for the German hater community is Telegram, for which only limited research efforts have been made so far. The goal of this study is to develop a broad framework that consists of (i) an abusive language classification model for German Telegram messages and (ii) a classification model for the hatefulness of Telegram channels. For the first part, we employ existing abusive language datasets containing posts from other platforms to build our classification models. For the channel classification model, we develop a method that combines channel-specific content information coming from a topic model with a social graph to predict the hatefulness of channels. Furthermore, we complement these two approaches for hate speech detection with insightful results on the evolution of the hater community on Telegram in Germany. Moreover, we propose methods for scalable network analyses of social media platforms to the hate speech research community. As an additional output of the study, we release an annotated abusive language dataset containing 1,149 annotated Telegram messages.


DivergentNets: Medical Image Segmentation by Network Ensemble

Jul 01, 2021
Vajira Thambawita, Steven A. Hicks, Pål Halvorsen, Michael A. Riegler

Detection of colon polyps has become a trending topic in the intersecting fields of machine learning and gastrointestinal endoscopy. The focus has mainly been on per-frame classification. More recently, polyp segmentation has gained attention in the medical community. Segmentation has the advantage of being more accurate than per-frame classification or object detection as it can show the affected area in greater detail. For our contribution to the EndoCV 2021 segmentation challenge, we propose two separate approaches. First, a segmentation model named TriUNet composed of three separate UNet models. Second, we combine TriUNet with an ensemble of well-known segmentation models, namely UNet++, FPN, DeepLabv3, and DeepLabv3+, into a model called DivergentNets to produce more generalizable medical image segmentation masks. In addition, we propose a modified Dice loss that calculates loss only for a single class when performing multiclass segmentation, forcing the model to focus on what is most important. Overall, the proposed methods achieved the best average scores for each respective round in the challenge, with TriUNet being the winning model in Round I and DivergentNets being the winning model in Round II of the segmentation generalization challenge at EndoCV 2021. The implementation of our approach is made publicly available on GitHub.

* Proceedings of the 3rd International Workshop and Challenge on Computer Vision in Endoscopy (EndoCV 2021) colocated with the 17th IEEE International Symposium on Biomedical Imaging (ISBI 2021) 
* the winning model of the segmentation generalization challenge at EndoCV 2021 
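
The abstract above mentions a modified Dice loss that is computed for a single class during multiclass segmentation. The abstract does not give the exact formulation, so the following is only a generic sketch of the idea under stated assumptions: a soft Dice loss over flattened per-pixel class probabilities, restricted to one chosen class. The function name, the `eps` smoothing term, and the toy inputs are all hypothetical.

```python
def single_class_dice_loss(probs, labels, target_class, eps=1e-6):
    """Soft Dice loss for one class only.

    probs:  one list of class probabilities per pixel (flattened image)
    labels: integer ground-truth class per pixel
    All other classes are ignored, so only `target_class` drives the loss.
    """
    intersection = 0.0
    pred_sum = 0.0
    true_sum = 0.0
    for pixel_probs, label in zip(probs, labels):
        p = pixel_probs[target_class]               # predicted probability
        t = 1.0 if label == target_class else 0.0   # binary ground truth
        intersection += p * t
        pred_sum += p
        true_sum += t
    dice = (2.0 * intersection + eps) / (pred_sum + true_sum + eps)
    return 1.0 - dice


# Toy example: four pixels, two classes (0 = background, 1 = polyp).
probs = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]]
labels = [0, 1, 1, 0]
loss = single_class_dice_loss(probs, labels, target_class=1)
print(round(loss, 4))
```

A perfect prediction for the chosen class gives a loss of 0, regardless of how the remaining classes are predicted.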
