Size Wu

UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing

Feb 02, 2026

Skywork UniPic 3.0: Unified Multi-Image Composition via Sequence Modeling

Jan 22, 2026

RecTok: Reconstruction Distillation along Rectified Flow

Dec 17, 2025

Generative Photographic Control for Scene-Consistent Video Cinematic Editing

Nov 17, 2025

OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation

May 29, 2025

Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

Mar 27, 2025

F-LMM: Grounding Frozen Large Multimodal Models

Jun 09, 2024

OMG-Seg: Is One Model Good Enough For All Segmentation?

Jan 18, 2024

CLIM: Contrastive Language-Image Mosaic for Region Representation

Dec 19, 2023

DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection

Oct 02, 2023