Dongha Kim

Personalizing Large Language Models using Retrieval Augmented Generation and Knowledge Graph

May 15, 2025

Deeksha Prahlad, Chanhee Lee, Dongha Kim, Hokeun Kim

Abstract:The advent of large language models (LLMs) has allowed numerous applications, including the generation of queried responses, to be leveraged in chatbots and other conversational assistants. Being trained on a plethora of data, LLMs often undergo high levels of over-fitting, resulting in the generation of extra and incorrect data, thus causing hallucinations in output generation. One of the root causes of such problems is the lack of timely, factual, and personalized information fed to the LLM. In this paper, we propose an approach to address these problems by introducing retrieval augmented generation (RAG) using knowledge graphs (KGs) to assist the LLM in personalized response generation tailored to the users. KGs have the advantage of storing continuously updated factual information in a structured way. While our KGs can be used for a variety of frequently updated personal data, such as calendar, contact, and location data, we focus on calendar data in this paper. Our experimental results show that our approach works significantly better in understanding personal information and generating accurate responses compared to the baseline LLMs using personal data as text inputs, with a moderate reduction in response time.

* To appear in the Companion Proceedings of the ACM Web Conference 2025 (WWW Companion '25)
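
The retrieval step described in the abstract above can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the triple schema, the `retrieve` and `build_prompt` helpers, and the sample calendar facts are hypothetical, and no actual LLM is called. It only shows how structured calendar facts could be selected and placed into a prompt.

```python
# Hypothetical calendar facts stored as (subject, relation, object) triples.
# The schema and the data are illustrative assumptions, not the paper's KG.
TRIPLES = [
    ("dentist appointment", "starts_at", "2025-05-20 09:00"),
    ("dentist appointment", "located_at", "Main Street Clinic"),
    ("team meeting", "starts_at", "2025-05-21 14:00"),
]


def retrieve(query, triples):
    """Return the triples whose subject appears in the query (case-insensitive)."""
    q = query.lower()
    return [t for t in triples if t[0].lower() in q]


def build_prompt(query, triples):
    """Build a prompt that lists the retrieved facts before the question."""
    facts = retrieve(query, triples)
    lines = [f"- {s} {r.replace('_', ' ')} {o}" for s, r, o in facts]
    context = "\n".join(lines) if lines else "- (no matching facts)"
    return (
        f"Known facts:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the facts above."
    )


prompt = build_prompt("When is my dentist appointment?", TRIPLES)
print(prompt)
```

A real system would likely match entities more robustly than by substring; the point is only that the model sees a small set of current, structured facts rather than the raw personal data as free text.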

Nonparametric estimation of a factorizable density using diffusion models

Jan 03, 2025

Hyeok Kyu Kwon, Dongha Kim, Ilsang Ohn, Minwoo Chae

Abstract:In recent years, diffusion models, and more generally score-based deep generative models, have achieved remarkable success in various applications, including image and audio generation. In this paper, we view diffusion models as an implicit approach to nonparametric density estimation and study them within a statistical framework to analyze their surprising performance. A key challenge in high-dimensional statistical inference is leveraging low-dimensional structures inherent in the data to mitigate the curse of dimensionality. We assume that the underlying density exhibits a low-dimensional structure by factorizing into low-dimensional components, a property common in examples such as Bayesian networks and Markov random fields. Under suitable assumptions, we demonstrate that an implicit density estimator constructed from diffusion models adapts to the factorization structure and achieves the minimax optimal rate with respect to the total variation distance. In constructing the estimator, we design a sparse weight-sharing neural network architecture, where sparsity and weight-sharing are key features of practical architectures such as convolutional neural networks and recurrent neural networks.
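
The factorization assumption in the abstract above can be written out as follows. The notation is an assumption made for illustration (the abstract does not fix symbols): $p_0$ is the data density on $\mathbb{R}^D$, $\mathcal{I}$ is a collection of index subsets, and $x_I$ is the sub-vector of $x$ indexed by $I$.

```latex
p_0(x) \;=\; \prod_{I \in \mathcal{I}} g_I(x_I),
\qquad x \in \mathbb{R}^D,\quad I \subseteq \{1,\dots,D\},\quad |I| \le d \ll D .
```

Each factor $g_I$ depends on at most $d$ coordinates, which is the low-dimensional structure shared by Bayesian networks and Markov random fields.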

Beyond Ontology in Dialogue State Tracking for Goal-Oriented Chatbot

Oct 30, 2024

Sejin Lee, Dongha Kim, Min Song

Abstract:Goal-oriented chatbots are essential for automating user tasks, such as booking flights or making restaurant reservations. A key component of these systems is Dialogue State Tracking (DST), which interprets user intent and maintains the dialogue state. However, existing DST methods often rely on fixed ontologies and manually compiled slot values, limiting their adaptability to open-domain dialogues. We propose a novel approach that leverages instruction tuning and advanced prompt strategies to enhance DST performance, without relying on any predefined ontologies. Our method enables a Large Language Model (LLM) to infer dialogue states through carefully designed prompts and includes an anti-hallucination mechanism to ensure accurate tracking in diverse conversation contexts. Additionally, we employ a Variational Graph Auto-Encoder (VGAE) to model and predict subsequent user intent. Our approach achieved state-of-the-art performance with a joint goal accuracy (JGA) of 42.57%, outperforming existing ontology-less DST models, and performed well in open-domain real-world conversations. This work presents a significant advancement in creating more adaptive and accurate goal-oriented chatbots.

* There are 10 chapters, including references, and 2 figures used. To be presented at the 15th IEEE International Conference on Knowledge Graphs (ICKG2024)
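
JGA in the abstract above is the usual abbreviation for joint goal accuracy: the fraction of dialogue turns whose predicted state matches the reference state exactly, across all slots. A minimal sketch follows, assuming each state is a plain slot-to-value dictionary; the slot names and values are hypothetical examples, not taken from the paper.

```python
def joint_goal_accuracy(predicted_states, gold_states):
    """Fraction of turns whose predicted state equals the gold state exactly."""
    if len(predicted_states) != len(gold_states):
        raise ValueError("predicted and gold state lists must have equal length")
    if not gold_states:
        return 0.0
    matches = sum(1 for p, g in zip(predicted_states, gold_states) if p == g)
    return matches / len(gold_states)


# Two turns of a hypothetical dialogue; the second prediction gets one slot wrong.
gold = [
    {"food": "italian"},
    {"food": "italian", "area": "centre"},
]
pred = [
    {"food": "italian"},
    {"food": "italian", "area": "north"},
]

jga = joint_goal_accuracy(pred, gold)
print(jga)
```

Because a single wrong slot makes the whole turn count as a miss, JGA is a strict metric, which helps put the reported figure in context.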

Illustrious: an Open Advanced Illustration Model

Sep 30, 2024

Sang Hyun Park, Jun Young Koh, Junha Lee, Joy Song, Dongha Kim, Hoyeon Moon, Hyunju Lee, Min Song

Abstract:In this work, we share insights for achieving state-of-the-art quality in our text-to-image anime image generative model, called Illustrious. To achieve high resolution, dynamic color range images, and high restoration ability, we focus on three critical approaches for model improvement. First, we delve into the significance of the batch size and dropout control, which enables faster learning of controllable token-based concept activations. Second, we increase the training resolution of images, which affects the accurate depiction of character anatomy at much higher resolution and extends its generation capability to over 20MP with proper methods. Finally, we propose refined multi-level captions, covering all tags and various natural language captions, as a critical factor for model development. Through extensive analysis and experiments, Illustrious demonstrates state-of-the-art performance in terms of animation style, outperforming widely used models in illustration domains and enabling easier customization and personalization thanks to its open-source nature. We plan to publicly release the updated Illustrious model series sequentially, along with sustainable plans for improvement.

ALTBI: Constructing Improved Outlier Detection Models via Optimization of Inlier-Memorization Effect

Aug 19, 2024

Seoyoung Cho, Jaesung Hwang, Kwan-Young Bak, Dongha Kim

Abstract:Outlier detection (OD) is the task of identifying unusual observations (or outliers) from given or upcoming data by learning unique patterns of normal observations (or inliers). Recently, a study introduced a powerful unsupervised OD (UOD) solver based on a new observation about deep generative models, called the inlier-memorization (IM) effect, which suggests that generative models memorize inliers before outliers in early learning stages. In this study, we aim to develop a theoretically principled method to address UOD tasks by maximally utilizing the IM effect. We begin by noting that the IM effect is observed more clearly when the given training data contain fewer outliers. This finding indicates a potential for enhancing the IM effect in UOD regimes if we can effectively exclude outliers from mini-batches when designing the loss function. To this end, we introduce two main techniques: 1) increasing the mini-batch size as the model training proceeds and 2) using an adaptive threshold to calculate the truncated loss function. We theoretically show that these two techniques effectively filter out outliers from the truncated loss function, allowing us to utilize the IM effect to the fullest. Coupled with an additional ensemble strategy, we propose our method and term it Adaptive Loss Truncation with Batch Increment (ALTBI). We provide extensive experimental results to demonstrate that ALTBI achieves state-of-the-art performance in identifying outliers compared to other recent methods, even with significantly lower computation costs. Additionally, we show that our method yields robust performance when combined with privacy-preserving algorithms.

* 24 pages in total
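
The two techniques named in the abstract above, a growing mini-batch size and a truncated loss with a data-dependent threshold, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' algorithm: the quantile-based threshold, the geometric growth rule, and the function names `truncated_mean_loss` and `batch_size_schedule` are all hypothetical choices.

```python
def truncated_mean_loss(losses, keep_fraction):
    """Average only the smallest per-sample losses in a mini-batch.

    The threshold is the empirical keep_fraction-quantile of the batch, so it
    adapts to each batch (an assumed rule, chosen here for illustration).
    Returns (mean of kept losses, threshold used).
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    ordered = sorted(losses)
    k = max(1, int(len(ordered) * keep_fraction))
    threshold = ordered[k - 1]
    kept = [x for x in losses if x <= threshold]
    return sum(kept) / len(kept), threshold


def batch_size_schedule(initial_size, growth, steps):
    """Mini-batch sizes that grow geometrically as training proceeds (assumed rule)."""
    return [int(initial_size * growth ** t) for t in range(steps)]


# One sample has a much larger loss and is dropped by the truncation.
losses = [0.1, 0.2, 0.3, 5.0]
mean_loss, threshold = truncated_mean_loss(losses, keep_fraction=0.75)
sizes = batch_size_schedule(64, 2.0, 3)
print(mean_loss, threshold, sizes)
```

Samples with large losses early in training are the likely outliers under the IM effect, so excluding them from the averaged loss keeps the gradient dominated by inliers.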

META-ANOVA: Screening interactions for interpretable machine learning

Aug 02, 2024

Yongchan Choi, Seokhun Park, Chanmoo Park, Dongha Kim, Yongdai Kim

Abstract:There are two things to be considered when we evaluate predictive models. One is prediction accuracy, and the other is interpretability. Over the recent decades, many prediction models of high performance, such as ensemble-based models and deep neural networks, have been developed. However, these models are often too complex, making it difficult to intuitively interpret their predictions. This complexity in interpretation limits their use in many real-world fields that require accountability, such as medicine, finance, and college admissions. In this study, we develop a novel method called Meta-ANOVA to provide an interpretable model for any given prediction model. The basic idea of Meta-ANOVA is to transform a given black-box prediction model to the functional ANOVA model. A novel technical contribution of Meta-ANOVA is a procedure of screening out unnecessary interactions before transforming a given black-box model to the functional ANOVA model. This screening procedure allows the inclusion of higher order interactions in the transformed functional ANOVA model without computational difficulties. We prove that the screening procedure is asymptotically consistent. Through various experiments with synthetic and real-world datasets, we empirically demonstrate the superiority of Meta-ANOVA.

* 26 pages
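
For readers unfamiliar with the functional ANOVA model mentioned in the abstract above, its standard form decomposes a function of $p$ inputs into a constant, main effects, and interactions of increasing order. The notation below is the common textbook one and is assumed here; the abstract does not give symbols.

```latex
f(x_1,\dots,x_p) \;=\; \beta_0
  \;+\; \sum_{j=1}^{p} f_j(x_j)
  \;+\; \sum_{1 \le j < k \le p} f_{jk}(x_j, x_k)
  \;+\; \cdots
  \;+\; f_{1 \cdots p}(x_1,\dots,x_p)
```

The number of candidate interaction terms grows combinatorially with the order, which is why screening out unnecessary interactions before fitting matters.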

Optimizing Quantum Convolutional Neural Network Architectures for Arbitrary Data Dimension

Mar 28, 2024

Changwon Lee, Israel F. Araujo, Dongha Kim, Junghan Lee, Siheon Park, Ju-Young Ryu, Daniel K. Park

Abstract:Quantum convolutional neural networks (QCNNs) represent a promising approach in quantum machine learning, paving new directions for both quantum and classical data analysis. This approach is particularly attractive due to the absence of the barren plateau problem, a fundamental challenge in training quantum neural networks (QNNs), and its feasibility. However, a limitation arises when applying QCNNs to classical data. The network architecture is most natural when the number of input qubits is a power of two, as this number is reduced by a factor of two in each pooling layer. The number of input qubits determines the dimensions (i.e. the number of features) of the input data that can be processed, restricting the applicability of QCNN algorithms to real-world data. To address this issue, we propose a QCNN architecture capable of handling arbitrary input data dimensions while optimizing the allocation of quantum resources such as ancillary qubits and quantum gates. This optimization is not only important for minimizing computational resources, but also essential in noisy intermediate-scale quantum (NISQ) computing, as the size of the quantum circuits that can be executed reliably is limited. Through numerical simulations, we benchmarked the classification performance of various QCNN architectures when handling arbitrary input data dimensions on the MNIST and Breast Cancer datasets. The results validate that the proposed QCNN architecture achieves excellent classification performance while utilizing a minimal resource overhead, providing an optimal solution when reliable quantum computation is constrained by noise and imperfections.

* 17 pages, 7 figures
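
The power-of-two constraint described in the abstract above can be seen with a little arithmetic. The sketch below is not the proposed architecture; it only counts qubits per layer when each pooling layer halves the register (rounding up when the count is odd), and computes the overhead of a naive baseline that pads the input up to the next power of two. Both helper names and the padding baseline are assumptions made for illustration.

```python
def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


def pooling_schedule(n_qubits):
    """Qubit counts per layer when each pooling layer halves the register.

    Odd counts are rounded up, which is where a non-power-of-two input
    stops halving cleanly.
    """
    if n_qubits < 1:
        raise ValueError("n_qubits must be at least 1")
    schedule = [n_qubits]
    while schedule[-1] > 1:
        schedule.append((schedule[-1] + 1) // 2)
    return schedule


print(pooling_schedule(8))        # clean halving
print(pooling_schedule(13))       # odd counts appear along the way
print(next_power_of_two(13) - 13) # extra qubits needed by naive padding
```

With 13 input qubits the register passes through odd sizes, and naive padding would add three extra qubits, which is the kind of resource overhead the abstract says the proposed design aims to minimize.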

ODIM: an efficient method to detect outliers via inlier-memorization effect of deep generative models

Jan 11, 2023

Dongha Kim, Jaesung Hwang, Jongjin Lee, Kunwoong Kim, Yongdai Kim

Abstract:Identifying whether a given sample is an outlier or not is an important issue in various real-world domains. This study aims to solve the unsupervised outlier detection problem where training data contain outliers, but no label information about inliers and outliers is given. We propose a powerful and efficient learning framework to identify outliers in a training data set using deep neural networks. We start with a new observation called the inlier-memorization (IM) effect. When we train a deep generative model with data contaminated with outliers, the model memorizes inliers before outliers. Exploiting this finding, we develop a new method called the outlier detection via the IM effect (ODIM). The ODIM only requires a few updates; thus, it is computationally efficient, tens of times faster than other deep-learning-based algorithms. Also, the ODIM filters out outliers successfully, regardless of the types of data, such as tabular, image, and sequential. We empirically demonstrate the superiority and efficiency of the ODIM by analyzing 20 data sets.

* 23 pages in total

Learning fair representation with a parametric integral probability metric

Feb 17, 2022

Dongha Kim, Kunwoong Kim, Insung Kong, Ilsang Ohn, Yongdai Kim

Abstract:As they have a vital effect on social decision-making, AI algorithms should be not only accurate but also fair. Among various algorithms for fair AI, learning fair representation (LFR), whose goal is to find a fair representation with respect to sensitive variables such as gender and race, has received much attention. For LFR, the adversarial training scheme is popularly employed, as is done in generative adversarial network type algorithms. The choice of a discriminator, however, is made heuristically without justification. In this paper, we propose a new adversarial training scheme for LFR, where the integral probability metric (IPM) with a specific parametric family of discriminators is used. The most notable result of the proposed LFR algorithm is its theoretical guarantee about the fairness of the final prediction model, which has not been considered yet. That is, we derive theoretical relations between the fairness of representation and the fairness of the prediction model built on top of the representation (i.e., using the representation as the input). Moreover, by numerical experiments, we show that our proposed LFR algorithm is computationally lighter and more stable, and the final prediction model is competitive or superior to other LFR algorithms using more complex discriminators.

* 24 pages, including references and appendix
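
The integral probability metric named in the abstract above has a standard general definition, reproduced here for reference. The symbols are assumed for illustration: $P_0$ and $P_1$ stand for the distributions of the learned representation in the two sensitive groups, and $\mathcal{F}$ is the family of discriminators. The abstract does not say which parametric family the paper uses, so $\mathcal{F}$ is left generic.

```latex
d_{\mathcal{F}}(P_0, P_1) \;=\; \sup_{f \in \mathcal{F}}
  \left| \, \mathbb{E}_{Z \sim P_0}\!\left[ f(Z) \right]
        - \mathbb{E}_{Z \sim P_1}\!\left[ f(Z) \right] \right|
```

A small value means no discriminator in $\mathcal{F}$ can tell the two groups apart from the representation alone, which is the sense in which the representation is fair.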

INN: A Method Identifying Clean-annotated Samples via Consistency Effect in Deep Neural Networks

Jun 29, 2021

Dongha Kim, Yongchan Choi, Kunwoong Kim, Yongdai Kim

Abstract:In many classification problems, collecting massive clean-annotated data is not easy, and thus much research has been done to handle data with noisy labels. Most recent state-of-the-art solutions for noisy label problems are built on the small-loss strategy, which exploits the memorization effect. While it is a powerful tool, the memorization effect has several drawbacks. The performance is sensitive to the choice of a training epoch required for utilizing the memorization effect. In addition, when the labels are heavily contaminated or imbalanced, the memorization effect may not occur, in which case the methods based on the small-loss strategy fail to identify clean-labeled data. We introduce a new method called INN (Integration with the Nearest Neighborhoods) to refine clean-labeled data from training data with noisy labels. The proposed method is based on a new discovery that a prediction pattern at neighbor regions of clean-labeled data is consistently different from that of noisy-labeled data regardless of training epochs. The INN method requires more computation but is much more stable and powerful than the small-loss strategy. By carrying out various experiments, we demonstrate that the INN method resolves the shortcomings of the memorization effect successfully and thus is helpful for constructing more accurate deep prediction models from training data with noisy labels.

* 17 pages, 9 figures
