Sangseok Yun

GLYPH-SR: Can We Achieve Both High-Quality Image Super-Resolution and High-Fidelity Text Recovery via VLM-guided Latent Diffusion Model?

Oct 30, 2025

Mingyu Sung, Seungjae Ham, Kangwoo Kim, Yeokyoung Yoon, Sangseok Yun, Il-Min Kim, Jae-Mo Kang

Abstract: Image super-resolution (SR) is fundamental to many vision systems, from surveillance and autonomy to document analysis and retail analytics, because recovering high-frequency details, especially scene-text, enables reliable downstream perception. Scene-text, i.e., text embedded in natural images such as signs, product labels, and storefronts, often carries the most actionable information; when characters are blurred or hallucinated, optical character recognition (OCR) and subsequent decisions fail even if the rest of the image appears sharp. Yet previous SR research has often been tuned to distortion metrics (PSNR/SSIM) or learned perceptual metrics (LPIPS, MANIQA, CLIP-IQA, MUSIQ) that are largely insensitive to character-level errors. Furthermore, studies that do address text SR often focus on simplified benchmarks with isolated characters, overlooking the challenges of text within complex natural scenes. As a result, scene-text is effectively treated as generic texture. For SR to be effective in practical deployments, it is therefore essential to explicitly optimize for both text legibility and perceptual quality. We present GLYPH-SR, a vision-language-guided diffusion framework that aims to achieve both objectives jointly. GLYPH-SR utilizes a Text-SR Fusion ControlNet (TS-ControlNet) guided by OCR data, and a ping-pong scheduler that alternates between text- and scene-centric guidance. To enable targeted text restoration, we train these components on a synthetic corpus while keeping the main SR branch frozen. Across SVT, SCUT-CTW1500, and CUTE80 at x4 and x8, GLYPH-SR improves OCR F1 by up to +15.18 percentage points over diffusion/GAN baselines (SVT x8, OpenOCR) while maintaining competitive MANIQA, CLIP-IQA, and MUSIQ. GLYPH-SR is designed to satisfy both objectives simultaneously (high readability and high visual realism), delivering SR that looks right and reads right.

* 11 pages, 6 figures. Includes supplementary material. Under review as a conference paper at ICLR 2026

Secure Power Control for Downlink Cell-Free Massive MIMO With Passive Eavesdroppers

Nov 25, 2022

Junguk Park, Sangseok Yun, Jeongseok Ha

Abstract: This work studies secure communications for a cell-free massive multiple-input multiple-output (CF-mMIMO) network which is attacked by multiple passive eavesdroppers overhearing communications between access points (APs) and users in the network. It will be revealed that the distributed APs in CF-mMIMO allow not only legitimate users but also eavesdroppers to reap the diversity gain, which seriously degrades secrecy performance. Motivated by this, this work proposes an artificial noise (AN)-aided secure power control scheme for CF-mMIMO under passive eavesdropping, aiming to achieve a higher secrecy rate and/or guarantee security. In particular, it will be demonstrated that careful use of the AN signal in the power control is especially important for improving the secrecy performance. The performance of the proposed power control scheme is evaluated and compared with various power control schemes via numerical experiments, which clearly show that the proposed scheme outperforms all the competing schemes.

* 5 pages, 3 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Fast Federated Learning by Balancing Communication Trade-Offs

May 23, 2021

Milad Khademi Nori, Sangseok Yun, Il-Min Kim

Abstract: Federated Learning (FL) has recently received a lot of attention for large-scale privacy-preserving machine learning. However, high communication overheads due to frequent gradient transmissions decelerate FL. To mitigate the communication overheads, two main techniques have been studied: (i) local update of weights, characterizing the trade-off between communication and computation, and (ii) gradient compression, characterizing the trade-off between communication and precision. To the best of our knowledge, studying and balancing those two trade-offs jointly and dynamically while considering their impacts on convergence has remained unresolved, even though it promises significantly faster FL. In this paper, we first formulate our problem to minimize learning error with respect to two variables: local update coefficients and sparsity budgets of gradient compression, which characterize trade-offs between communication and computation/precision, respectively. We then derive an upper bound of the learning error in a given wall-clock time considering the interdependency between the two variables. Based on this theoretical analysis, we propose an enhanced FL scheme, namely Fast FL (FFL), that jointly and dynamically adjusts the two variables to minimize the learning error. We demonstrate that FFL consistently achieves higher accuracies faster than similar schemes existing in the literature.

* 14 pages, 24 figures, accepted for publication in IEEE Transactions on Communications
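
The two variables named in the abstract above (the number of local updates and the sparsity budget) can be illustrated with a minimal sketch. This is not the FFL algorithm: the abstract does not give FFL's adjustment rule, so both values are fixed here, and top-k selection is assumed as the sparsifying compressor. The names `top_k_sparsify`, `local_round`, `grad_fn`, `local_steps`, `k`, and `lr` are hypothetical and chosen only for illustration.

```python
def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries of `update`; zero out the rest.

    Assumption: top-k is used as the sparsifier; k plays the role of a
    "sparsity budget" (fewer kept entries = less communication, less precision).
    """
    if k <= 0:
        return [0.0] * len(update)
    k = min(k, len(update))
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(order[:k])
    return [u if i in keep else 0.0 for i, u in enumerate(update)]


def local_round(w, grad_fn, local_steps, k, lr=0.1):
    """Run `local_steps` plain gradient steps locally, then sparsify the update.

    More local steps = fewer communication rounds but more local computation.
    Returns the sparsified weight change a client would transmit.
    """
    w_local = list(w)
    for _ in range(local_steps):
        g = grad_fn(w_local)
        w_local = [wi - lr * gi for wi, gi in zip(w_local, g)]
    delta = [wl - wi for wl, wi in zip(w_local, w)]
    return top_k_sparsify(delta, k)


if __name__ == "__main__":
    # Toy objective 0.5 * ||w||^2, whose gradient is w itself.
    sent = local_round([1.0, -2.0, 0.5], lambda w: list(w), local_steps=3, k=1)
    print(sent)  # only the largest-magnitude coordinate of the update is non-zero
```

In a scheme like the one the abstract describes, `local_steps` and `k` would be re-chosen over time rather than held constant; how that choice is made is the paper's contribution and is not reproduced here.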
