Ting-Hsiang Wang

DivAug: Plug-in Automated Data Augmentation with Explicit Diversity Maximization

Mar 26, 2021
Zirui Liu, Haifeng Jin, Ting-Hsiang Wang, Kaixiong Zhou, Xia Hu

Over the past two years, human-designed data augmentation strategies have been replaced by automatically learned augmentation policies. Specifically, recent work has empirically shown that the superior performance of automated data augmentation methods stems from increasing the diversity of augmented data. However, two factors regarding the diversity of augmented data are still missing: 1) an explicit definition (and thus measurement) of diversity and 2) a quantifiable relationship between diversity and its regularization effects. To bridge this gap, we propose a diversity measure called Variance Diversity and theoretically show that the regularization effect of data augmentation is ensured by Variance Diversity. We validate in experiments that the relative gain in test accuracy from automated data augmentation is highly correlated with Variance Diversity. An unsupervised sampling-based framework, DivAug, is designed to directly maximize Variance Diversity and hence strengthen the regularization effect. Without requiring a separate search process, DivAug achieves a performance gain comparable to that of the state-of-the-art method, with better efficiency. Moreover, under the semi-supervised setting, our framework further improves the performance of semi-supervised learning algorithms compared to RandAugment, making it highly applicable to real-world problems where labeled data is scarce.

Via arXiv

AutoRec: An Automated Recommender System

Jun 26, 2020
Ting-Hsiang Wang, Qingquan Song, Xiaotian Han, Zirui Liu, Haifeng Jin, Xia Hu

Realistic recommender systems are often required to adapt to ever-changing data and tasks or to explore different models systematically. To address this need, we present AutoRec, an open-source automated machine learning (AutoML) platform extended from the TensorFlow ecosystem and, to our knowledge, the first framework to leverage AutoML for model search and hyperparameter tuning in deep recommendation models. AutoRec also supports a highly flexible pipeline that accommodates both sparse and dense inputs, rating prediction and click-through rate (CTR) prediction tasks, and an array of recommendation models. Lastly, AutoRec provides a simple, user-friendly API. Experiments conducted on benchmark datasets show that AutoRec is reliable and can identify models that resemble the best model without prior knowledge.

Via arXiv

Superhighway: Bypass Data Sparsity in Cross-Domain CF

Aug 28, 2018
Kwei-Herng Lai, Ting-Hsiang Wang, Heng-Yu Chi, Yian Chen, Ming-Feng Tsai, Chuan-Ju Wang

Cross-domain collaborative filtering (CF) aims to alleviate data sparsity in single-domain CF by leveraging knowledge transferred from related domains. Many traditional methods focus on enriching compared neighborhood relations in CF directly to address the sparsity problem. In this paper, we propose superhighway construction, an alternative explicit relation-enrichment procedure, to improve recommendations by enhancing cross-domain connectivity. Specifically, assuming partially overlapping items (users), superhighway replaces multi-hop inter-domain paths between cross-domain users (items, respectively) with direct paths to enrich cross-domain connectivity. Experiments conducted on a real-world cross-region music dataset and a cross-platform movie dataset show that the proposed superhighway construction significantly improves recommendation performance in both the target and source domains.
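
As a rough illustration of the bypass idea only, the sketch below adds a direct link between a source-domain user and a target-domain user whenever both interacted with the same overlapping item, which shortens the user-item-user path to one hop. The function name `build_superhighways`, the edge-list input format, and the "one shared item is enough" rule are all assumptions made for this example; the abstract does not specify how the paper selects or weights these direct paths.

```python
from collections import defaultdict
from itertools import product


def build_superhighways(source_edges, target_edges):
    """Return direct (source_user, target_user) links (hypothetical helper).

    source_edges / target_edges are iterables of (user, item) interaction
    pairs, one list per domain. Items appearing in both lists are treated
    as the partially overlapping items mentioned in the abstract.
    """
    source_users_by_item = defaultdict(set)
    for user, item in source_edges:
        source_users_by_item[item].add(user)

    target_users_by_item = defaultdict(set)
    for user, item in target_edges:
        target_users_by_item[item].add(user)

    # Items observed in both domains act as the bridge between them.
    shared_items = source_users_by_item.keys() & target_users_by_item.keys()

    # Assumption: one shared item suffices to connect two users directly.
    highways = set()
    for item in shared_items:
        highways.update(
            product(source_users_by_item[item], target_users_by_item[item])
        )
    return highways


source = [("s1", "song_a"), ("s2", "song_b")]
target = [("t1", "song_a"), ("t2", "song_c")]
links = build_superhighways(source, target)
```

Here only `song_a` appears in both domains, so `s1` and `t1` are the only pair that receives a direct link. The symmetric case in the abstract (overlapping users, direct item-to-item paths) follows by swapping the roles of users and items in the input pairs.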

Via arXiv