Yong Jae Lee

Reasoning-Augmented Representations for Multimodal Retrieval

Feb 06, 2026

Agentic Very Long Video Understanding

Jan 26, 2026

VideoWeave: A Data-Centric Approach for Efficient Video Understanding

Jan 09, 2026

Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration

Dec 11, 2025

Relational Visual Similarity

Dec 08, 2025

Can World Simulators Reason? Gen-ViRe: A Generative Visual Reasoning Benchmark

Nov 17, 2025

Contamination Detection for VLMs using Multi-Modal Semantic Perturbation

Nov 05, 2025

Real Deep Research for AI, Robotics and Beyond

Oct 23, 2025

CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems

Jun 09, 2025

UniTalk: Towards Universal Active Speaker Detection in Real World Scenarios

May 28, 2025