Tae-Hyun Oh

POSTECH

DarkEQA: Benchmarking Vision-Language Models for Embodied Question Answering in Low-Light Indoor Environments

Dec 31, 2025
Via arXiv

FacEDiT: Unified Talking Face Editing and Generation via Facial Motion Infilling

Dec 16, 2025
Via arXiv

Patch-wise Retrieval: A Bag of Practical Techniques for Instance-level Matching

Dec 14, 2025
Via arXiv

PAVAS: Physics-Aware Video-to-Audio Synthesis

Dec 09, 2025
Via arXiv

RetouchLLM: Training-free White-box Image Retouching

Oct 09, 2025
Via arXiv

VSC: Visual Search Compositional Text-to-Image Diffusion Model

May 02, 2025
Via arXiv

JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers

May 01, 2025
Via arXiv

AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation

Apr 29, 2025
Via arXiv

VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models

Apr 03, 2025
Via arXiv

Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics

Mar 27, 2025
Via arXiv