Junwon Lee

RLDX-1 Technical Report

May 05, 2026
Via arXiv

AgentLens: Adaptive Visual Modalities for Human-Agent Interaction in Mobile GUI Agents

Apr 22, 2026
Via arXiv

UNMIXX: Untangling Highly Correlated Singing Voices Mixtures

Jan 19, 2026
Via arXiv

KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation

Feb 21, 2025
Via arXiv

Sound Scene Synthesis at the DCASE 2024 Challenge

Jan 15, 2025
Via arXiv

Challenge on Sound Scene Synthesis: Evaluating Text-to-Audio Generation

Oct 23, 2024
Via arXiv

Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound

Aug 21, 2024
Via arXiv

CONMOD: Controllable Neural Frame-based Modulation Effects

Jun 20, 2024
Via arXiv

Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant

Mar 26, 2024
Via arXiv

T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis

Jan 17, 2024
Via arXiv