Yong Jae Lee

From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing

May 14, 2026

Exploration and Exploitation Errors Are Measurable for Language Model Agents

Apr 14, 2026

MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models

Mar 26, 2026

Unified Spatio-Temporal Token Scoring for Efficient Video VLMs

Mar 18, 2026

Spatially Grounded Long-Horizon Task Planning in the Wild

Mar 13, 2026

Reasoning-Augmented Representations for Multimodal Retrieval

Feb 06, 2026

Agentic Very Long Video Understanding

Jan 26, 2026

VideoWeave: A Data-Centric Approach for Efficient Video Understanding

Jan 09, 2026

Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration

Dec 11, 2025

Relational Visual Similarity

Dec 08, 2025