Mental health is a critical issue in the modern society, mental disorders could sometimes turn to suicidal ideation without effective treatment. Early detection of mental disorders and suicidal ideation from social content provides a potential way for effective social intervention. Classifying suicidal ideation and other mental disorders, however, is a challenging task as they share quite similar patterns in language usage and sentimental polarity. In this paper, we enhance text representation with lexicon-based sentiment scores and latent topics, and propose to use relation networks for detecting suicidal ideation and mental disorders with related risk indicators. The relation module is further equipped with the attention mechanism to prioritize more important relational features. Through experiments on three real-world datasets, our model outperforms most of its counterparts.
Generative adversarial networks (GANs) have achieved impressive results today, but not all generated images are perfect. A number of quantitative criteria have recently emerged for generative model, but none of them are designed for a single generated image. In this paper, we propose a new research topic, Generated Image Quality Assessment (GIQA), which quantitatively evaluates the quality of each generated image. We introduce three GIQA algorithms from two perspectives: learning-based and data-based. We evaluate a number of images generated by various recent GAN models on different datasets and demonstrate that they are consistent with human assessments. Furthermore, GIQA is available to many applications, like separately evaluating the realism and diversity of generative models, and enabling online hard negative mining (OHEM) in the training of GANs to improve the results.
In exchange for large quantities of data and processing power, deep neural networks have yielded models that provide state of the art predication capabilities in many fields. However, a lack of strong guarantees on their behaviour have raised concerns over their use in safety-critical applications. A first step to understanding these networks is to develop alternate representations that allow for further analysis. It has been shown that neural networks with piecewise affine activation functions are themselves piecewise affine, with their domains consisting of a vast number of linear regions. So far, the research on this topic has focused on counting the number of linear regions, rather than obtaining explicit piecewise affine representations. This work presents a novel algorithm that can compute the piecewise affine form of any fully connected neural network with rectified linear unit activations.
Human action recognition in video is an active yet challenging research topic due to high variation and complexity of data. In this paper, a novel video based action recognition framework utilizing complementary cues is proposed to handle this complex problem. Inspired by the successful two stream networks for action classification, additional pose features are studied and fused to enhance understanding of human action in a more abstract and semantic way. Towards practices, not only ground truth poses but also noisy estimated poses are incorporated in the framework with our proposed pre-processing module. The whole framework and each cue are evaluated on varied benchmarking datasets as JHMDB, sub-JHMDB and Penn Action. Our results outperform state-of-the-art performance on these datasets and show the strength of complementary cues.
Continuous symmetries and their breaking play a prominent role in contemporary physics. Effective low-energy field theories around symmetry breaking states explain diverse phenomena such as superconductivity, magnetism, and the mass of nucleons. We show that such field theories can also be a useful tool in machine learning, in particular for loss functions with continuous symmetries that are spontaneously broken by random initializations. In this paper, we illuminate our earlier published work (Bamler & Mandt, 2018) on this topic more from the perspective of theoretical physics. We show that the analogies between superconductivity and symmetry breaking in temporal representation learning are rather deep, allowing us to formulate a gauge theory of `charged' embedding vectors in time series models. We show that making the loss function gauge invariant speeds up convergence in such models.
A general information transmission model, under independent and identically distributed Gaussian codebook and nearest neighbor decoding rule with processed channel output, is investigated using the performance metric of generalized mutual information. When the encoder and the decoder know the statistical channel model, it is found that the optimal channel output processing function is the conditional expectation operator, thus hinting a potential role of regression, a classical topic in machine learning, for this model. Without utilizing the statistical channel model, a problem formulation inspired by machine learning principles is established, with suitable performance metrics introduced. A data-driven inference algorithm is proposed to solve the problem, and the effectiveness of the algorithm is validated via numerical experiments. Extensions to more general information transmission models are also discussed.
The problem of allocating students to supervisors for the development of a personal project or a dissertation is a crucial activity in the higher education environment, as it enables students to get feedback on their work from an expert and improve their personal, academic, and professional abilities. In this article, we propose a multi-objective and near Pareto optimal genetic algorithm for the allocation of students to supervisors. The allocation takes into consideration the students and supervisors' preferences on research/project topics, the lower and upper supervision quotas of supervisors, as well as the workload balance amongst supervisors. We introduce novel mutation and crossover operators for the student-supervisor allocation problem. The experiments carried out show that the components of the genetic algorithm are more apt for the problem than classic components, and that the genetic algorithm is capable of producing allocations that are near Pareto optimal in a reasonable time.
Artificial Intelligence principles define social and ethical considerations to develop future AI. They come from research institutes, government organizations and industries. All versions of AI principles are with different considerations covering different perspectives and making different emphasis. None of them can be considered as complete and can cover the rest AI principle proposals. Here we introduce LAIP, an effort and platform for linking and analyzing different Artificial Intelligence Principles. We want to explicitly establish the common topics and links among AI Principles proposed by different organizations and investigate on their uniqueness. Based on these efforts, for the long-term future of AI, instead of directly adopting any of the AI principles, we argue for the necessity of incorporating various AI Principles into a comprehensive framework and focusing on how they can interact and complete each other.
Tactile sensors supply useful information during the interaction with an object that can be used for assessing the stability of a grasp. Most of the previous works on this topic processed tactile readings as signals by calculating hand-picked features. Some of them have processed these readings as images calculating characteristics on matrix-like sensors. In this work, we explore how non-matrix sensors (sensors with taxels not arranged exactly in a matrix) can be processed as tactile images as well. In addition, we prove that they can be used for predicting grasp stability by training a Convolutional Neural Network (CNN) with them. We captured over 2500 real three-fingered grasps on 41 everyday objects to train a CNN that exploited the local connectivity inherent on the non-matrix tactile sensors, achieving 94.2% F1-score on predicting stability.
Human lives are important. The decision to allow self-driving vehicles operate on our roads carries great weight. This has been a hot topic of debate between policy-makers, technologists and public safety institutions. The recent Uber Inc. self-driving car crash, resulting in the death of a pedestrian, has strengthened the argument that autonomous vehicle technology is still not ready for deployment on public roads. In this work, we analyze the Uber car crash and shed light on the question, "Could the Uber Car Crash have been avoided?". We apply state-of-the-art Computer Vision models to this highly practical scenario. More generally, our experimental results are an evaluation of various image enhancement and object recognition techniques for enabling pedestrian safety in low-lighting conditions using the Uber crash as a case study.