Yun Jang

d-DQIVAR: Data-centric Visual Analytics and Reasoning for Data Quality Improvement

Jul 16, 2025

Hyein Hong, Sangbong Yoo, SeokHwan Choi, Jisue Kim, Seongbum Seo, Haneol Cho, Chansoo Kim, Yun Jang

Abstract: Approaches to enhancing data quality (DQ) are classified into two main categories: data-driven and process-driven. However, prior research has predominantly utilized batch data preprocessing within the data-driven framework, which often proves insufficient for optimizing machine learning (ML) model performance and frequently leads to distortions in data characteristics. Existing studies have primarily focused on data preprocessing rather than genuine data quality improvement (DQI). In this paper, we introduce d-DQIVAR, a novel visual analytics system designed to facilitate DQI strategies aimed at improving ML model performance. Our system integrates visual analytics techniques that leverage both data-driven and process-driven approaches. Data-driven techniques address DQ issues through imputation, outlier detection, deletion, format standardization, removal of duplicate records, and feature selection. Process-driven strategies encompass evaluating DQ and DQI procedures by considering DQ dimensions and ML model performance and by applying the Kolmogorov-Smirnov test. Through case studies, evaluations, and user studies, we illustrate how our system empowers users to harness expert and domain knowledge effectively within a practical workflow.
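
The abstract mentions applying the Kolmogorov-Smirnov test but does not say how it is used inside d-DQIVAR. As a rough illustration only, the sketch below assumes one plausible use: comparing a feature's distribution before and after a DQI step (here, mean imputation) to see how much the step distorts it. The function name `ks_statistic` and the sample data are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs.
    Hypothetical helper for illustration; not the d-DQIVAR implementation.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The supremum of the gap between two step functions is reached
    # at one of the observed data points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    # Assumed scenario: five observed values, five missing values filled
    # with the observed mean (3.0).
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]
    mean_imputed = observed + [3.0] * 5
    print(round(ks_statistic(observed, mean_imputed), 3))
```

A larger statistic suggests that the imputed column departs further from the distribution of the observed values. In practice a library routine such as SciPy's `ks_2samp` would also return a p-value.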


Time Series Imputation with Multivariate Radial Basis Function Neural Network

Jul 31, 2024

Chanyoung Jung, Yun Jang

Abstract: Researchers have been persistently working to address the issue of missing values in time series data. Numerous models have been proposed, striving to estimate the distribution of the data. The Radial Basis Function Neural Network (RBFNN) has recently exhibited exceptional performance in estimating data distribution. In this paper, we propose a time series imputation model based on RBFNN. Our imputation model learns local information from timestamps to create a continuous function. Additionally, we incorporate time gaps so that learning accounts for the duration over which values are missing. We name this model the Missing Imputation Multivariate RBFNN (MIM-RBFNN). However, MIM-RBFNN relies on a local information-based learning approach, which makes it difficult to utilize temporal information. Therefore, we propose an extension called the Missing Value Imputation Recurrent Neural Network with Continuous Function (MIRNN-CF), which uses the continuous function generated by MIM-RBFNN. We evaluate the performance using two real-world datasets with non-random and random missing patterns, and conduct an ablation study comparing MIM-RBFNN and MIRNN-CF.
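
To make the idea of building a continuous function from local timestamp information concrete, here is a minimal sketch. It is not MIM-RBFNN: it swaps in a plain normalized Gaussian kernel smoother with a fixed width, with no learned centers, no time-gap input, and no multivariate coupling. The function name `rbf_impute` and the parameter `sigma` are hypothetical.

```python
import math


def rbf_impute(timestamps, values, sigma=1.0):
    """Fill None entries using a Gaussian-weighted average of observed values.

    Illustrative stand-in for an RBF-based continuous function over time;
    not the model proposed in the paper.
    """
    observed = [(t, v) for t, v in zip(timestamps, values) if v is not None]
    if not observed:
        raise ValueError("need at least one observed value")

    def continuous_fn(t):
        # One Gaussian basis function per observed timestamp.
        weights = [
            math.exp(-((t - ti) ** 2) / (2.0 * sigma ** 2))
            for ti, _ in observed
        ]
        total = sum(weights)
        if total == 0.0:
            # Every weight underflowed: fall back to the nearest observation.
            return min(observed, key=lambda p: abs(p[0] - t))[1]
        return sum(w * v for w, (_, v) in zip(weights, observed)) / total

    # Observed values are kept; only missing entries are estimated.
    return [
        v if v is not None else continuous_fn(t)
        for t, v in zip(timestamps, values)
    ]


if __name__ == "__main__":
    filled = rbf_impute([0.0, 1.0, 2.0], [1.0, None, 3.0])
    print([round(x, 3) for x in filled])
```

Because each weight decays with distance in time, an estimate is dominated by nearby observations. That locality is the property the abstract names as a limitation when longer-range temporal structure matters.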


Time Series Missing Imputation with Multivariate Radial Basis Function Neural Network

Jul 24, 2024

Chanyoung Jung, Yun Jang
