Byeonghyun Pak

Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition

Mar 14, 2026
Via arXiv

Aligning Forest and Trees in Images and Long Captions for Visually Grounded Understanding

Feb 03, 2026
Via arXiv

Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration

Nov 06, 2025
Via arXiv

Textual Query-Driven Mask Transformer for Domain Generalized Segmentation

Jul 12, 2024
Via arXiv