Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sanghyun Park

Yonsei University, Seoul, Republic of Korea

GeneralThinker: Domain-General Reasoning through Likelihood-Guided Answer-Conditioned Optimization

May 27, 2026

Shengmin Piao, Sanghyun Park

Abstract:Reinforcement learning with verifiable rewards improves language model reasoning, but its reliance on domain-specific verifiers, sparse outcome rewards, and coarse-grained credit assignment limits its applicability. We introduce GeneralThinker, an on-policy framework that reformulates reasoning supervision as dense answer-conditioned optimization, enabling response-level evaluation and token-level credit assignment without domain-specific verifiers. GeneralThinker evaluates generated reasoning trajectories using the likelihood of the ground-truth answer and derives token-wise compatibility signals for fine-grained credit assignment. To stabilize optimization, it constrains token-level updates through clipping and direction-preserving modulation. Across 11 benchmarks spanning mathematics, STEM, and general reasoning, GeneralThinker achieves the best average performance. Further analyses show that uncontrolled token-level modulation can destabilize training, whereas controlled modulation makes fine-grained credit assignment consistently effective.

Via

Access Paper or Ask Questions

ALIVE-LIO: Degeneracy-Aware Learning of Inertial Velocity for Enhancing ESKF-Based LiDAR-Inertial Odometry

Apr 03, 2026

Seongjun Kim, Daehan Lee, Junwoo Hong, Sanghyun Park, Hyunyoung Jo, Soohee Han

Abstract:Odometry estimation using light detection and ranging (LiDAR) and an inertial measurement unit (IMU), known as LiDAR-inertial odometry (LIO), often suffers from performance degradation in degenerate environments, such as long corridors or single-wall scenarios with narrow field-of-view LiDAR. To address this limitation, we propose ALIVE-LIO, a degeneracy-aware LiDAR-inertial odometry framework that explicitly enhances state estimation in degenerate directions. The key contribution of ALIVE-LIO is the strategic integration of a deep neural network into a classical error-state Kalman filter (ESKF) to compensate for the loss of LiDAR observability. Specifically, ALIVE-LIO employs a neural network to predict the body-frame velocity and selectively fuses this prediction into the ESKF only when degeneracy is detected, providing effective state updates along degenerate directions. This design enables ALIVE-LIO to utilize the probabilistic structure and consistency of the ESKF while benefiting from learning-based motion estimation. The proposed method was evaluated on publicly available datasets exhibiting degeneracy, as well as on our own collected data. Experimental results demonstrate that ALIVE-LIO substantially reduces pose drift in degenerate environments, yielding the most competitive results in 22 out of 32 sequences. The implementation of ALIVE-LIO will be publicly available.

* 18 pages, 9 figures

Via

Access Paper or Ask Questions

Let Triggers Control: Frequency-Aware Dropout for Effective Token Control

Mar 28, 2026

Junyoung Koh, Hoyeon Moon, Dongha Kim, Seungmin Lee, Sanghyun Park, Min Song

Abstract:Text-to-image models such as Stable Diffusion have achieved unprecedented levels of high-fidelity visual synthesis. As these models advance, personalization of generative models -- commonly facilitated through Low-Rank Adaptation (LoRA) with a dedicated trigger token -- has become a significant area of research. Previous works have naively assumed that fine-tuning with a single trigger token to represent new concepts. However, this often results in poor controllability, where the trigger token alone fails to reliably evoke the intended concept. We attribute this issue to the frequent co-occurrence of the trigger token with the surrounding context during fine-tuning, which entangles their representations and compromises the token's semantic distinctiveness. To disentangle this, we propose Frequency-Aware Dropout (FAD) -- a novel regularization technique that improves prompt controllability without adding new parameters. FAD consists of two key components: co-occurrence analysis and curriculum-inspired scheduling. Qualitative and quantitative analyses across token-based diffusion models (SD~1.5 and SDXL) and natural language--driven backbones (FLUX and Qwen-Image) demonstrate consistent gains in prompt fidelity, stylistic precision, and user-perceived quality. Our method provides a simple yet effective dropout strategy that enhances controllability and personalization in text-to-image generation. Notably, it achieves these improvements without introducing additional parameters or architectural modifications, making it readily applicable to existing models with minimal computational overhead.

* CVPR 2026 P13N: Personalization in Generative AI workshop

Via

Access Paper or Ask Questions

ROFT-VINS: Robust Feature Tracking-based Visual-Inertial State Estimation for Harsh Environment

Mar 19, 2026

Sanghyun Park, Soohee Han

Abstract:SLAM (Simultaneous Localization and Mapping) and Odometry are important systems for estimating the position of mobile devices, such as robots and cars, utilizing one or more sensors. Particularly in camera-based SLAM or Odometry, effectively tracking visual features is important as it significantly impacts system performance. In this paper, we propose a method that leverages deep learning to robustly track visual features in monocular camera images. This method operates reliably even in textureless environments and situations with rapid lighting changes. Additionally, we evaluate the performance of our proposed method by integrating it into VINS-Fusion (Monocular-Inertial), a commonly used Visual-Inertial Odometry (VIO) system.

* S. Park and S. Han, "ROFT-VINS: Robust Feature Tracking-based Visual-Inertial State Estimation for Harsh Environment," 2024 24th International Conference on Control, Automation and Systems (ICCAS) 2024, pp. 508-513
* 6 pages, published ICCAS 2024

Via

Access Paper or Ask Questions

Benchmarking Visual Feature Representations for LiDAR-Inertial-Visual Odometry Under Challenging Conditions

Mar 19, 2026

Eunseon Choi, Junwoo Hong, Daehan Lee, Sanghyun Park, Hyunyoung Jo, Sunyoung Kim, Changho Kang, Seongsam Kim, Yonghan Jung, Jungwook Park(+2 more)

Abstract:Accurate localization in autonomous driving is critical for successful missions including environmental mapping and survivor searches. In visually challenging environments, including low-light conditions, overexposure, illumination changes, and high parallax, the performance of conventional visual odometry methods significantly degrade undermining robust robotic navigation. Researchers have recently proposed LiDAR-inertial-visual odometry (LIVO) frameworks, that integrate LiDAR, IMU, and camera sensors, to address these challenges. This paper extends the FAST-LIVO2-based framework by introducing a hybrid approach that integrates direct photometric methods with descriptor-based feature matching. For the descriptor-based feature matching, this work proposes pairs of ORB with the Hamming distance, SuperPoint with SuperGlue, SuperPoint with LightGlue, and XFeat with the mutual nearest neighbor. The proposed configurations are benchmarked by accuracy, computational cost, and feature tracking stability, enabling a quantitative comparison of the adaptability and applicability of visual descriptors. The experimental results reveal that the proposed hybrid approach outperforms the conventional sparse-direct method. Although the sparse-direct method often fails to converge in regions where photometric inconsistency arises due to illumination changes, the proposed approach still maintains robust performance under the same conditions. Furthermore, the hybrid approach with learning-based descriptors enables robust and reliable visual state estimation across challenging environments.

* E. Choi et al., "Benchmarking Visual Feature Representations for LiDAR-Inertial-Visual Odometry Under Challenging Conditions," in IEEE Access, vol. 14, pp. 30186-30199, 2026
* 14 pages, Publised IEEE Access2026

Via

Access Paper or Ask Questions

GenZ-LIO: Generalizable LiDAR-Inertial Odometry Beyond Indoor--Outdoor Boundaries

Mar 17, 2026

Daehan Lee, Hyungtae Lim, Seongjun Kim, Soonbin Rho, Changhyeon Lee, Sanghyun Park, Junwoo Hong, Eunseon Choi, Hyunyoung Jo, Soohee Han

Abstract:Light detection and ranging (LiDAR)-inertial odometry (LIO) enables accurate localization and mapping for autonomous navigation in various scenes. However, its performance remains sensitive to variations in spatial scale, which refers to the spatial extent of the scene reflected in the distribution of point ranges in a LiDAR scan. Transitions between confined indoor and expansive outdoor spaces induce substantial variations in point density, which may reduce robustness and computational efficiency. To address this issue, we propose GenZ-LIO, a LIO framework generalizable across both indoor and outdoor environments. GenZ-LIO comprises three key components. First, inspired by the principle of the proportional-integral-derivative (PID) controller, it adaptively regulates the voxel size for downsampling via feedback control, driving the voxelized point count toward a scale-informed setpoint while enabling stable and efficient processing across varying scene scales. Second, we formulate a hybrid-metric state update that jointly leverages point-to-plane and point-to-point residuals to mitigate LiDAR degeneracy arising from directionally insufficient geometric constraints. Third, to alleviate the computational burden introduced by point-to-point matching, we introduce a voxel-pruned correspondence search strategy that discards non-promising voxel candidates and reduces unnecessary computations. Experimental results demonstrate that GenZ-LIO achieves robust odometry estimation and improved computational efficiency across confined indoor, open outdoor, and transitional environments. Our code will be made publicly available upon publication.

* 19 pages, 11 figures

Via

Access Paper or Ask Questions

Motion-Specific Battery Health Assessment for Quadrotors Using High-Fidelity Battery Models

Mar 13, 2026

Joonhee Kim, Sanghyun Park, Donghyeong Kim, Eunseon Choi, Soohee Han

Abstract:Quadrotor endurance is ultimately limited by battery behavior, yet most energy aware planning treats the battery as a simple energy reservoir and overlooks how flight motions induce dynamic current loads that accelerate battery degradation. This work presents an end to end framework for motion aware battery health assessment in quadrotors. We first design a wide range current sensing module to capture motion specific current profiles during real flights, preserving transient features. In parallel, a high fidelity battery model is calibrated using reference performance tests and a metaheuristic based on a degradation coupled electrochemical model.By simulating measured flight loads in the calibrated model, we systematically resolve how different flight motions translate into degradation modes loss of lithium inventory and loss of active material as well as internal side reactions. The results demonstrate that even when two flight profiles consume the same average energy, their transient load structures can drive different degradation pathways, emphasizing the need for motion-aware battery management that balances efficiency with battery degradation.

* 8 pages. Accepted to IEEE International Conference on Robotics and Automation (ICRA) 2026

Via

Access Paper or Ask Questions

SpiralThinker: Latent Reasoning through an Iterative Process with Text-Latent Interleaving

Nov 12, 2025

Shengmin Piao, Sanghyun Park

Figure 1 for SpiralThinker: Latent Reasoning through an Iterative Process with Text-Latent Interleaving

Figure 2 for SpiralThinker: Latent Reasoning through an Iterative Process with Text-Latent Interleaving

Figure 3 for SpiralThinker: Latent Reasoning through an Iterative Process with Text-Latent Interleaving

Figure 4 for SpiralThinker: Latent Reasoning through an Iterative Process with Text-Latent Interleaving

Abstract:Recent advances in large reasoning models have been driven by reinforcement learning and test-time scaling, accompanied by growing interest in latent rather than purely textual reasoning. However, existing latent reasoning methods lack mechanisms to ensure stable evolution of latent representations and a systematic way to interleave implicit and explicit reasoning. We introduce SpiralThinker, a unified framework that performs iterative updates over latent representations, enabling extended implicit reasoning without generating additional tokens. A progressive alignment objective combined with structured annotations maintains coherence between latent and textual reasoning. Across mathematical, logical, and commonsense reasoning tasks, SpiralThinker achieves the best overall performance among latent reasoning approaches, consistently surpassing previous methods across all benchmarks. Detailed analyses reveal that both iteration and alignment are indispensable, the numbers of latent tokens and iterations exhibit dataset-specific optima, and appropriate alignment proves critical for an effective iterative process. Overall, SpiralThinker bridges iterative computation and latent reasoning, demonstrating that aligned iterative updates can reliably steer reasoning in the latent space.

Via

Access Paper or Ask Questions

Relation-Aware Bayesian Optimization of DBMS Configurations Guided by Affinity Scores

Oct 31, 2025

Sein Kwon, Seulgi Baek, Hyunseo Yang, Youngwan Jo, Sanghyun Park

Abstract:Database Management Systems (DBMSs) are fundamental for managing large-scale and heterogeneous data, and their performance is critically influenced by configuration parameters. Effective tuning of these parameters is essential for adapting to diverse workloads and maximizing throughput while minimizing latency. Recent research has focused on automated configuration optimization using machine learning; however, existing approaches still exhibit several key limitations. Most tuning frameworks disregard the dependencies among parameters, assuming that each operates independently. This simplification prevents optimizers from leveraging relational effects across parameters, limiting their capacity to capture performancesensitive interactions. Moreover, to reduce the complexity of the high-dimensional search space, prior work often selects only the top few parameters for optimization, overlooking others that contribute meaningfully to performance. Bayesian Optimization (BO), the most common method for automatic tuning, is also constrained by its reliance on surrogate models, which can lead to unstable predictions and inefficient exploration. To overcome these limitations, we propose RelTune, a novel framework that represents parameter dependencies as a Relational Graph and learns GNN-based latent embeddings that encode performancerelevant semantics. RelTune further introduces Hybrid-Score-Guided Bayesian Optimization (HBO), which combines surrogate predictions with an Affinity Score measuring proximity to previously high-performing configurations. Experimental results on multiple DBMSs and workloads demonstrate that RelTune achieves faster convergence and higher optimization efficiency than conventional BO-based methods, achieving state-of-the-art performance across all evaluated scenarios.

* 13 pages

Via

Access Paper or Ask Questions

Real-Time Sleepiness Detection for Driver State Monitoring System

Apr 21, 2025

Deepak Ghimire, Sunghwan Jeong, Sunhong Yoon, Sanghyun Park, Juhwan Choi

Abstract:A driver face monitoring system can detect driver fatigue, which is a significant factor in many accidents, using computer vision techniques. In this paper, we present a real-time technique for driver eye state detection. First, the face is detected, and the eyes are located within the face region for tracking. A normalized cross-correlation-based online dynamic template matching technique, combined with Kalman filter tracking, is proposed to track the detected eye positions in subsequent image frames. A support vector machine with histogram of oriented gradients (HOG) features is used to classify the state of the eyes as open or closed. If the eyes remain closed for a specified period, the driver is considered to be asleep, and an alarm is triggered.

* Advanced Science and Technology Letters, 120, 1-8 (2015)
* 8 pages, published in GST 2015

Via

Access Paper or Ask Questions