This article is an overview of supervised machine learning problems for regression and classification. Topics include: kernel methods, training by stochastic gradient descent, deep learning architecture, losses for classification, statistical learning theory, and dimension independent generalization bounds. Implicit regularization in deep learning examples are presented, including data augmentation, adversarial training, and additive noise. These methods are reframed as explicit gradient regularization.
Random Projection is a foundational research topic that connects a bunch of machine learning algorithms under a similar mathematical basis. It is used to reduce the dimensionality of the dataset by projecting the data points efficiently to a smaller dimensions while preserving the original relative distance between the data points. In this paper, we are intended to explain random projection method, by explaining its mathematical background and foundation, the applications that are currently adopting it, and an overview on its current research perspective.
Injury analysis may be one of the most beneficial applications of deep learning based human pose estimation. To facilitate further research on this topic, we provide an injury specific 2D dataset for alpine skiing, covering in total 533 images. We further propose a post processing routine, that combines rotational information with a simple kinematic model. We could improve detection results in fall situations by up to 21% regarding the [email protected] metric.
Opinion mining on social media posts has become more and more popular. Users often express their opinion on a topic not only with words but they also use image symbols such as emoticons and emoji. In this paper, we investigate the effect of emoji-based features in opinion classification of Uzbek texts, and more specifically movie review comments from YouTube. Several classification algorithms are tested, and feature ranking is performed to evaluate the discriminative ability of the emoji-based features.
This paper introduces a temporal framework for detecting and clustering emergent and viral topics on social networks. Endogenous and exogenous influence on developing viral content is explored using a clustering method based on the a user's behavior on social network and a dataset from Twitter API. Results are discussed by introducing metrics such as popularity, burstiness, and relevance score. The results show clear distinction in characteristics of developed content by the two classes of users.
This document summarizes the 4th International Workshop on Recovering 6D Object Pose which was organized in conjunction with ECCV 2018 in Munich. The workshop featured four invited talks, oral and poster presentations of accepted workshop papers, and an introduction of the BOP benchmark for 6D object pose estimation. The workshop was attended by 100+ people working on relevant topics in both academia and industry who shared up-to-date advances and discussed open problems.
We develop a high-quality multi-turn dialog dataset, DailyDialog, which is intriguing in several aspects. The language is human-written and less noisy. The dialogues in the dataset reflect our daily communication way and cover various topics about our daily life. We also manually label the developed dataset with communication intention and emotion information. Then, we evaluate existing approaches on DailyDialog dataset and hope it benefit the research field of dialog systems.
We present an automatic mortality prediction scheme based on the unstructured textual content of clinical notes. Proposing a convolutional document embedding approach, our empirical investigation using the MIMIC-III intensive care database shows significant performance gains compared to previously employed methods such as latent topic distributions or generic doc2vec embeddings. These improvements are especially pronounced for the difficult problem of post-discharge mortality prediction.
This paper addresses the question of how language use affects community reaction to comments in online discussion forums, and the relative importance of the message vs. the messenger. A new comment ranking task is proposed based on community annotated karma in Reddit discussions, which controls for topic and timing of comments. Experimental work with discussion threads from six subreddits shows that the importance of different types of language features varies with the community of interest.
Cuckoo search (CS) is a relatively new algorithm, developed by Yang and Deb in 2009, and CS is efficient in solving global optimization problems. In this paper, we review the fundamental ideas of cuckoo search and the latest developments as well as its applications. We analyze the algorithm and gain insight into its search mechanisms and find out why it is efficient. We also discuss the essence of algorithms and its link to self-organizing systems, and finally we propose some important topics for further research.