Title: A Survey on Data Collection for Machine Learning: a Big Data - AI Integration Perspective

Nov 08, 2018

Yuji Roh, Geon Heo, Steven Euijong Whang

Abstract: Data collection is a major bottleneck in machine learning and an active research topic in multiple communities. There are largely two reasons data collection has recently become a critical issue. First, as machine learning becomes more widely used, we are seeing new applications that do not necessarily have enough labeled data. Second, unlike traditional machine learning, where feature engineering is the bottleneck, deep learning techniques generate features automatically but require large amounts of labeled data instead. Interestingly, recent research in data collection comes not only from the machine learning, natural language, and computer vision communities, but also from the data management community, due to the importance of handling large amounts of data. In this survey, we perform a comprehensive study of data collection from a data management point of view. Data collection largely consists of data acquisition, data labeling, and improvement of existing data or models. We present a research landscape of these operations, offer guidelines on which technique to use when, and identify interesting research challenges. The integration of machine learning and data management for data collection is part of a larger trend of Big Data and Artificial Intelligence (AI) integration and opens many opportunities for new research.

* 19 pages
