Ji-Hoon Kim

MamTra: A Hybrid Mamba-Transformer Backbone for Speech Synthesis

Mar 12, 2026
Via arXiv

Learning Where It Matters: Geometric Anchoring for Robust Preference Alignment

Feb 04, 2026
Via arXiv

UNMIXX: Untangling Highly Correlated Singing Voices Mixtures

Jan 19, 2026
Via arXiv

TAVID: Text-Driven Audio-Visual Interactive Dialogue Generation

Dec 23, 2025
Via arXiv

Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment

May 26, 2025
Via arXiv

AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation

Apr 29, 2025
Via arXiv

SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models

Apr 01, 2025
Via arXiv

EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models

Jan 10, 2025
Via arXiv

AdaptVC: High Quality Voice Conversion with Adaptive Learning

Jan 07, 2025
Via arXiv

CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation

Dec 28, 2024
Via arXiv