Youngjun Choi

MIDUS: Memory-Infused Depth Up-Scaling

Dec 15, 2025

Taero Kim, Hoyoon Byun, Youngjun Choi, Sungrae Park, Kyungwoo Song

Abstract: Scaling large language models (LLMs) demands approaches that increase capacity without incurring excessive parameter growth or inference cost. Depth Up-Scaling (DUS) has emerged as a promising strategy by duplicating layers and applying Continual Pre-training (CPT), but its reliance on feed-forward networks (FFNs) limits efficiency and attainable gains. We introduce Memory-Infused Depth Up-Scaling (MIDUS), which replaces FFNs in duplicated blocks with a head-wise memory layer (HML). Motivated by observations that attention heads have distinct roles both across and within layers, MIDUS assigns an independent memory bank to each head, enabling head-wise retrieval and injecting information into subsequent layers while preserving head-wise functional structure. This design combines sparse memory access with head-wise representations and incorporates an efficient per-head value factorization module, thereby relaxing the usual efficiency-performance trade-off. Across our CPT experiments, MIDUS exhibits robust performance improvements over strong DUS baselines while maintaining a highly efficient parameter footprint. Our findings establish MIDUS as a compelling and resource-efficient alternative to conventional FFN replication for depth up-scaling by leveraging its head-wise memory design.
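The abstract does not spell out how head-wise retrieval is computed, so the following is only a minimal sketch of the general idea of one memory bank per attention head with sparse (top-k) access. The dot-product scoring, the softmax over the selected slots, and the names `head_memory_lookup` and `headwise_memory_layer` are all assumptions made for illustration; the per-head value factorization module is not modeled here.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
Bank = Tuple[Sequence[Vector], Sequence[Vector]]  # (keys, values) for one head


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def head_memory_lookup(query: Vector, keys: Sequence[Vector],
                       values: Sequence[Vector], top_k: int) -> Vector:
    """Sparse lookup for ONE head: score every key, keep the top_k slots,
    and return the softmax-weighted sum of their values (assumed scheme)."""
    scores = [dot(query, k) for k in keys]
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:top_k]
    peak = max(scores[i] for i in top)  # subtract max for numerical stability
    weights = [math.exp(scores[i] - peak) for i in top]
    total = sum(weights)
    out = [0.0] * len(values[0])
    for w, i in zip(weights, top):
        for d, v in enumerate(values[i]):
            out[d] += (w / total) * v
    return out


def headwise_memory_layer(head_queries: Sequence[Vector],
                          banks: Sequence[Bank], top_k: int = 2) -> Vector:
    """Each head queries only its own bank; per-head outputs are concatenated."""
    out: Vector = []
    for query, (keys, values) in zip(head_queries, banks):
        out.extend(head_memory_lookup(query, keys, values, top_k))
    return out


# Toy usage: two heads, each with a three-slot bank of 2-d keys and values.
bank = ([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
        [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
output = headwise_memory_layer([[1.0, 0.0], [0.0, 1.0]], [bank, bank], top_k=1)
```

Because each head touches only `top_k` slots of its own bank, the cost per token in this sketch depends on `top_k` rather than on the bank size, which is the kind of sparse access the abstract alludes to.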

Uncertainty-driven Embedding Convolution

Jul 28, 2025

Sungjun Lim, Kangjun Noh, Youngjun Choi, Heeyoung Lee, Kyungwoo Song

Abstract: Text embeddings are essential components in modern NLP pipelines. While numerous embedding models have been proposed, their performance varies across domains, and no single model consistently excels across all tasks. This variability motivates the use of ensemble techniques to combine complementary strengths. However, most existing ensemble methods operate on deterministic embeddings and fail to account for model-specific uncertainty, limiting their robustness and reliability in downstream applications. To address these limitations, we propose Uncertainty-driven Embedding Convolution (UEC). UEC first transforms deterministic embeddings into probabilistic ones in a post-hoc manner. It then computes adaptive ensemble weights based on embedding uncertainty, grounded in a Bayes-optimal solution under a surrogate loss. Additionally, UEC introduces an uncertainty-aware similarity function that directly incorporates uncertainty into similarity scoring. Extensive experiments on retrieval, classification, and semantic similarity benchmarks demonstrate that UEC consistently improves both performance and robustness by leveraging principled uncertainty modeling.
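The abstract does not give the weighting formula, so the sketch below only illustrates the general notion of uncertainty-based ensemble weights. It assumes each model supplies a diagonal-Gaussian embedding (a mean and a per-dimension variance) of the same dimensionality and fuses them with inverse-variance weights, a standard way to combine independent Gaussian estimates. This is a swapped-in textbook technique, not necessarily the paper's Bayes-optimal solution, and `precision_weighted_ensemble` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def precision_weighted_ensemble(
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Fuse per-model Gaussian embeddings dimension by dimension.

    Each model's weight in a dimension is its precision (1 / variance),
    so more certain models contribute more. Assumes all variances > 0.
    """
    dim = len(means[0])
    fused_mean: List[float] = []
    fused_var: List[float] = []
    for d in range(dim):
        precisions = [1.0 / var[d] for var in variances]
        total = sum(precisions)
        fused_mean.append(
            sum(p * mean[d] for p, mean in zip(precisions, means)) / total
        )
        fused_var.append(1.0 / total)  # fused estimate is more certain
    return fused_mean, fused_var


# Toy usage: the second model is less certain in dimension 0,
# so the fused mean there sits closer to the first model's value.
mean, var = precision_weighted_ensemble(
    means=[[1.0, 0.0], [3.0, 0.0]],
    variances=[[1.0, 1.0], [3.0, 1.0]],
)
```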

LBC: Language-Based-Classifier for Out-Of-Variable Generalization

Aug 21, 2024

Kangjun Noh, Baekryun Seong, Hoyoon Byun, Youngjun Choi, Sungjin Song, Kyungwoo Song

Abstract: Large Language Models (LLMs) have achieved great success in natural language processing tasks such as response generation. However, their use in tabular data has been limited due to their inferior performance compared to traditional machine learning models (TMLs) such as XGBoost. We find that the pre-trained knowledge of LLMs enables them to interpret new variables that appear at test time without additional training, a capability central to the concept of Out-of-Variable (OOV). Based on these findings, we propose a Language-Based-Classifier (LBC), a classifier that maximizes the benefits of LLMs to outperform TMLs on OOV tasks. LBC employs three key methodological strategies: 1) categorical changes to adjust data to better fit the model's understanding, 2) advanced order and indicator to enhance data representation to the model, and 3) a verbalizer to map logit scores to classes during inference to generate model predictions. These strategies, combined with the pre-trained knowledge of LBC, emphasize the model's ability to effectively handle OOV tasks. We empirically and theoretically validate the superiority of LBC. LBC is the first study to apply an LLM-based model to OOV tasks. The source code is at https://github.com/sksmssh/LBCforOOVGen

* 16 pages, 7 figures, 4 tables
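The third strategy in the LBC abstract, a verbalizer that maps logit scores to classes, can be illustrated with a small sketch. It assumes the language model has already produced next-token logits as a token-to-score dictionary; pooling a class's tokens with log-sum-exp and the name `verbalize` are illustrative choices, not details taken from the paper or its repository.

```python
import math
from typing import Dict, Hashable, List, Tuple


def verbalize(
    token_logits: Dict[str, float],
    verbalizer: Dict[Hashable, List[str]],
) -> Tuple[Hashable, Dict[Hashable, float]]:
    """Map next-token logits to class probabilities.

    verbalizer lists, for each class label, the answer tokens that count
    as that class. Tokens outside the verbalizer are ignored.
    """
    class_scores: Dict[Hashable, float] = {}
    for label, tokens in verbalizer.items():
        logits = [token_logits[t] for t in tokens if t in token_logits]
        if not logits:
            raise ValueError(f"no logits found for class {label!r}")
        peak = max(logits)
        # log-sum-exp pools the probability mass of all tokens for the class
        class_scores[label] = peak + math.log(
            sum(math.exp(x - peak) for x in logits)
        )
    # softmax restricted to the verbalized classes
    peak = max(class_scores.values())
    exp_scores = {c: math.exp(s - peak) for c, s in class_scores.items()}
    total = sum(exp_scores.values())
    probs = {c: e / total for c, e in exp_scores.items()}
    prediction = max(probs, key=probs.get)
    return prediction, probs


# Toy usage: "maybe" has the largest logit but belongs to no class,
# so it does not influence the prediction.
prediction, probs = verbalize(
    {"yes": 2.0, "Yes": 2.0, "no": 1.0, "maybe": 9.0},
    {1: ["yes", "Yes"], 0: ["no"]},
)
```

Restricting the softmax to verbalizer tokens keeps the classifier from assigning probability to free-form text that corresponds to no class.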

Audio Mosaicing with Simulation-based Inference

Oct 26, 2022

Andrew Gambardella, Youngjun Choi, Doyo Choi, Jinjoon Lee

Abstract: Mosaics and collages have been an integral part of art for decades. Particularly important in contemporary media art is the audio mosaic, in which an artist manually combines elements from several disparate audio sources to construct a single coherent sound. Here we propose an algorithm to automatically create audio mosaics using the simulation-based inference paradigm. Our algorithm takes as input an audio file that one wishes to approximate, and a list of audio files one can use for approximation, finding a posterior distribution from which one can sample reconstructions of the original audio file, using the sources in an interpretable and disentangled manner. We validate our approach by creating an audio mosaic which reconstructs the sound of a traditional Korean funeral using 100 K-pop songs rearranged and overlapped.
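The abstract does not say which simulation-based inference method is used, so the sketch below substitutes rejection ABC, the simplest member of that family, purely to make the input/output contract concrete: a target signal and a list of sources go in, posterior samples over how the sources are used come out. Treating the latent variables as per-source mixing weights with a uniform prior, the RMS distance, and all function names are assumptions for illustration; real audio would also need time offsets and far longer signals.

```python
import math
import random
from typing import List, Sequence


def mix(weights: Sequence[float], sources: Sequence[Sequence[float]]) -> List[float]:
    """Simulator: weighted sum of equal-length source signals."""
    length = len(sources[0])
    return [sum(w * s[t] for w, s in zip(weights, sources)) for t in range(length)]


def rms_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def rejection_abc(target: Sequence[float], sources: Sequence[Sequence[float]],
                  n_draws: int, epsilon: float, seed: int = 0) -> List[List[float]]:
    """Draw weights from a uniform prior on [0, 1], simulate a mix, and keep
    the draws whose simulation lands within epsilon of the target."""
    rng = random.Random(seed)
    accepted: List[List[float]] = []
    for _ in range(n_draws):
        weights = [rng.random() for _ in sources]
        if rms_distance(mix(weights, sources), target) < epsilon:
            accepted.append(weights)
    return accepted


# Toy usage: recover the mixing weights of two short "audio" signals.
sources = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
target = mix([0.3, 0.7], sources)
samples = rejection_abc(target, sources, n_draws=2000, epsilon=0.1, seed=0)
```

Each accepted weight vector is one sample from the approximate posterior, and passing it back through `mix` yields one candidate reconstruction of the target.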
