Picture for Kaihang Pan

Kaihang Pan

Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning

Add code
Apr 22, 2025
Viaarxiv icon

Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens

Add code
Apr 20, 2025
Viaarxiv icon

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Add code
Dec 13, 2024
Figure 1 for Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
Figure 2 for Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
Figure 3 for Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
Figure 4 for Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
Viaarxiv icon

STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

Add code
Nov 29, 2024
Viaarxiv icon

AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea

Add code
Nov 24, 2024
Figure 1 for AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Figure 2 for AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Figure 3 for AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Figure 4 for AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Viaarxiv icon

Unified Generative and Discriminative Training for Multi-modal Large Language Models

Add code
Nov 01, 2024
Figure 1 for Unified Generative and Discriminative Training for Multi-modal Large Language Models
Figure 2 for Unified Generative and Discriminative Training for Multi-modal Large Language Models
Figure 3 for Unified Generative and Discriminative Training for Multi-modal Large Language Models
Figure 4 for Unified Generative and Discriminative Training for Multi-modal Large Language Models
Viaarxiv icon

Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration

Add code
Sep 30, 2024
Viaarxiv icon

Auto-Encoding Morph-Tokens for Multimodal LLM

Add code
May 03, 2024
Viaarxiv icon

Improving Vision Anomaly Detection with the Guidance of Language Modality

Add code
Oct 04, 2023
Viaarxiv icon

ControlRetriever: Harnessing the Power of Instructions for Controllable Retrieval

Add code
Aug 19, 2023
Figure 1 for ControlRetriever: Harnessing the Power of Instructions for Controllable Retrieval
Figure 2 for ControlRetriever: Harnessing the Power of Instructions for Controllable Retrieval
Figure 3 for ControlRetriever: Harnessing the Power of Instructions for Controllable Retrieval
Figure 4 for ControlRetriever: Harnessing the Power of Instructions for Controllable Retrieval
Viaarxiv icon