"Topic": models, code, and papers

Identifying similarity and anomalies for cryptocurrency moments and distribution extremities

Jan 26, 2020
Nick James, Max Menzies, Jennifer Chan

We propose two new methods for identifying similarity and anomalies among collections of time series, and apply these methods to analyse cryptocurrencies. First, we analyse change points with respect to various distribution moments, considering these points as signals of erratic behaviour and potential risk. This technique uses the MJ$_1$ semi-metric, from the more general MJ$_p$ class of semi-metrics \citep{James2019}, to measure distance between these change point sets. Prior work on this topic fails to consider data between change points, and in particular, does not justify the utility of this change point analysis. Therefore, we introduce a second method to determine similarity between time series, in this instance with respect to their extreme values, or tail behaviour. Finally, we measure the consistency between our two methods, that is, structural break versus tail behaviour similarity. With cryptocurrency investment as an apt example of erratic, extreme behaviour, we notice an impressive consistency between these two methods.

* Substantial additions compared to v1. New method and analysis 

A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications

Jan 20, 2020
Jie Gui, Zhenan Sun, Yonggang Wen, Dacheng Tao, Jieping Ye

Generative adversarial networks (GANs) have recently become a hot research topic. GANs have been widely studied since 2014, and a large number of algorithms have been proposed. However, there are few comprehensive studies explaining the connections among different GAN variants and how they have evolved. In this paper, we attempt to provide a review of various GAN methods from the perspectives of algorithms, theory, and applications. Firstly, the motivations, mathematical representations, and structure of most GAN algorithms are introduced in detail. Furthermore, GANs have been combined with other machine learning algorithms for specific applications, such as semi-supervised learning, transfer learning, and reinforcement learning. This paper compares the commonalities and differences of these GAN methods. Secondly, theoretical issues related to GANs are investigated. Thirdly, typical applications of GANs in image processing and computer vision, natural language processing, music, speech and audio, the medical field, and data science are illustrated. Finally, open research problems for GANs are pointed out.

Artificial Intelligence for Social Good: A Survey

Jan 07, 2020
Zheyuan Ryan Shi, Claire Wang, Fei Fang

Artificial intelligence for social good (AI4SG) is a research theme that aims to use and advance artificial intelligence to address societal issues and improve the well-being of the world. AI4SG has received lots of attention from the research community in the past decade with several successful applications. Building on the most comprehensive collection of the AI4SG literature to date with over 1000 contributed papers, we provide a detailed account and analysis of the work under the theme in the following ways. (1) We quantitatively analyze the distribution and trend of the AI4SG literature in terms of application domains and AI techniques used. (2) We propose three conceptual methods to systematically group the existing literature and analyze the eight AI4SG application domains in a unified framework. (3) We distill five research topics that represent the common challenges in AI4SG across various application domains. (4) We discuss five issues that, we hope, can shed light on the future development of the AI4SG research.

Improving Entity Linking by Modeling Latent Entity Type Information

Jan 06, 2020
Shuang Chen, Jinpeng Wang, Feng Jiang, Chin-Yew Lin

Existing state-of-the-art neural entity linking models employ an attention-based bag-of-words context model and pre-trained entity embeddings bootstrapped from word embeddings to assess topic-level context compatibility. However, the latent entity type information in the immediate context of the mention is neglected, which often causes the models to link mentions to incorrect entities of the wrong type. To tackle this problem, we propose to inject latent entity type information into the entity embeddings based on pre-trained BERT. In addition, we integrate a BERT-based entity similarity score into the local context model of a state-of-the-art model to better capture latent entity type information. Our model significantly outperforms the state-of-the-art entity linking models on a standard benchmark (AIDA-CoNLL). Detailed experimental analysis demonstrates that our model corrects most of the type errors produced by the direct baseline.

* Accepted by AAAI 2020 

A Distributed Fair Machine Learning Framework with Private Demographic Data Protection

Sep 17, 2019
Hui Hu, Yijun Liu, Zhen Wang, Chao Lan

Fair machine learning has become a significant research topic with broad societal impact. However, most fair learning methods require direct access to personal demographic data, the use of which is increasingly restricted to protect user privacy (e.g. by the EU General Data Protection Regulation). In this paper, we propose a distributed fair learning framework for protecting the privacy of demographic data. We assume this data is privately held by a third party, which can communicate with the data center (responsible for model development) without revealing the demographic information. We propose a principled approach to designing fair learning methods under this framework, exemplify four methods, and show that they consistently outperform their existing counterparts in both fairness and accuracy across three real-world data sets. We theoretically analyze the framework and prove that it can learn models with high fairness or high accuracy, with the trade-off between them balanced by a threshold variable.

* 9 pages, 4 figures, International Conference of Data Mining

Cross-Enhancement Transform Two-Stream 3D ConvNets for Pedestrian Action Recognition of Autonomous Vehicles

Aug 19, 2019
Dong Cao, Lisha Xu

Action recognition is an important research topic in machine vision. It is widely used in many fields and is one of the key technologies for pedestrian behavior recognition and intention prediction in autonomous driving. Building on the widely used 3D ConvNets algorithm, and combining it with the Two-Stream Inflated algorithm and transfer learning, we construct a Cross-Enhancement Transform based Two-Stream 3D ConvNets algorithm. The performance of the algorithm varies across datasets with different data distribution characteristics; in particular, the RGB stream and the optical flow stream perform differently. To handle this, we take the data distribution characteristics of the specific dataset into account: the better-performing of the two streams serves as a teaching model that assists in training the other stream, after which two-stream inference is performed. We conducted experiments on the UCF-101, HMDB-51, and Kinetics data sets, and the experimental results confirmed the effectiveness of our algorithm.

* Accepted for publication in AIIPCC 2019 

Learning while Competing -- 3D Modeling & Design

May 18, 2019
Kalind Karia, Rucmenya Bessariya, Krishna Lala, Kavi Arya

The e-Yantra project at IIT Bombay conducts an online competition, e-Yantra Robotics Competition (eYRC) which uses a Project Based Learning (PBL) methodology to train students to implement a robotics project in a step-by-step manner over a five-month period. Participation is absolutely free. The competition provides all resources - robot, accessories, and a problem statement - to a participating team. If selected for the finals, e-Yantra pays for them to come to the finals at IIT Bombay. This makes the competition accessible to resource-poor student teams. In this paper, we describe the methodology used in the 6th edition of eYRC, eYRC-2017 where we experimented with a Theme (projects abstracted into rulebooks) involving an advanced topic - 3D Designing and interfacing with sensors and actuators. We demonstrate that the learning outcomes are consistent with our previous studies [1]. We infer that even 3D designing to create a working model can be effectively learned in a competition mode through PBL.

Applications of Social Media in Hydroinformatics: A Survey

May 01, 2019
Yufeng Yu, Yuelong Zhu, Dingsheng Wan, Qun Zhao, Kai Shu, Huan Liu

Floods of research and practical applications employ social media data for a wide range of public applications, including environmental monitoring, water resource management, and disaster and emergency response. Hydroinformatics can benefit from social media technologies, with newly emerged data, techniques, and analytical tools for handling large datasets, from which creative ideas and new value can be mined. This paper first proposes a 4W (What, Why, When, hoW) model and a methodological structure to better understand and represent the application of social media to hydroinformatics, then provides an overview of academic research applying social media to hydroinformatics in areas such as water environment, water resources, flood, drought, and water scarcity management. Finally, based on the preceding discussion, some advanced topics and suggestions for water-related social media applications, covering data collection, data quality management, fake news detection, privacy issues, algorithms, and platforms, are presented to hydroinformatics managers and researchers.

* 37 pages

Deep Learning to Predict Student Outcomes

Apr 27, 2019
Byung-Hak Kim

The increasingly fast development cycle for online course content, along with the diverse student demographics in each online classroom, makes real-time student outcome prediction an interesting topic for both industrial research and practical needs. In this paper, we tackle the problem of real-time student performance prediction in an ongoing course using a domain adaptation framework. This framework is a system trained on labeled student outcome data from previous coursework but meant to be deployed on another course. In particular, we introduce a GritNet architecture and develop an unsupervised domain adaptation method to transfer a GritNet trained on a past course to a new course without any student outcome labels. Our results for real Udacity student graduation predictions show that the GritNet not only generalizes well from one course to another across different Nanodegree programs, but also enhances real-time predictions, specifically in the first few weeks, when accurate predictions are most challenging.

* Accepted as oral presentation to ICLR 2019, AI for Social Good Workshop. arXiv admin note: substantial text overlap with arXiv:1809.06686, arXiv:1804.07405 

Condition-Transforming Variational AutoEncoder for Conversation Response Generation

Apr 24, 2019
Yu-Ping Ruan, Zhen-Hua Ling, Quan Liu, Zhigang Chen, Nitin Indurkhya

This paper proposes a new model, called condition-transforming variational autoencoder (CTVAE), to improve the performance of conversation response generation using conditional variational autoencoders (CVAEs). In conventional CVAEs, the prior distribution of the latent variable z follows a multivariate Gaussian distribution with mean and variance modulated by the input conditions. Previous work found that this distribution tends to become condition-independent in practical applications. In our proposed CTVAE model, the latent variable z is sampled by performing a non-linear transformation on the combination of the input conditions and samples from a condition-independent prior distribution $\mathcal{N}(0, I)$. In our objective evaluations, the CTVAE model outperforms the CVAE model on fluency metrics and surpasses a sequence-to-sequence (Seq2Seq) model on diversity metrics. In subjective preference tests, our proposed CTVAE model performs significantly better than the CVAE and Seq2Seq models at generating fluent, informative, and topic-relevant responses.

* ICASSP 2019, oral 
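
The sampling step described in the CTVAE abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the two-layer tanh network, the layer sizes, and the names `init_params` and `sample_latent` are all assumptions made for illustration. The only idea taken from the abstract is that z is obtained by applying a non-linear transformation to the combination (here, a concatenation) of the condition vector and a sample from a condition-independent $\mathcal{N}(0, I)$ prior.

```python
import numpy as np

def init_params(cond_dim, noise_dim, hidden_dim, latent_dim, rng):
    """Random weights for a hypothetical two-layer transform (illustrative only)."""
    w1 = rng.standard_normal((cond_dim + noise_dim, hidden_dim)) * 0.1
    b1 = np.zeros(hidden_dim)
    w2 = rng.standard_normal((hidden_dim, latent_dim)) * 0.1
    b2 = np.zeros(latent_dim)
    return w1, b1, w2, b2

def sample_latent(condition, params, rng):
    """Draw z by transforming [condition; eps], where eps ~ N(0, I)."""
    w1, b1, w2, b2 = params
    noise_dim = w1.shape[0] - condition.shape[0]
    # The noise is drawn from a standard normal that ignores the condition.
    eps = rng.standard_normal(noise_dim)
    # The condition enters only through the non-linear transform below.
    hidden = np.tanh(np.concatenate([condition, eps]) @ w1 + b1)
    return hidden @ w2 + b2

params = init_params(cond_dim=4, noise_dim=3, hidden_dim=8, latent_dim=2,
                     rng=np.random.default_rng(0))
z = sample_latent(np.ones(4), params, np.random.default_rng(1))
```

By contrast, a conventional CVAE prior would have the condition set the mean and variance of the Gaussian from which z is drawn directly; in this sketch the Gaussian is fixed and the condition only shapes the transformation applied afterwards.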
