Kai Han

Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping

Mar 10, 2025
Via arXiv

DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning

Mar 09, 2025
Via arXiv

ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval

Feb 21, 2025
Via arXiv

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Jan 21, 2025
Via arXiv

VipDiff: Towards Coherent and Diverse Video Inpainting via Training-free Denoising Diffusion Models

Jan 21, 2025
Via arXiv

Eve: Efficient Multimodal Vision Language Models with Elastic Visual Experts

Jan 08, 2025
Via arXiv

PruneVid: Visual Token Pruning for Efficient Video Large Language Models

Dec 20, 2024
Via arXiv

OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization

Dec 19, 2024
Via arXiv

Beyond Outcomes: Transparent Assessment of LLM Reasoning in Games

Dec 18, 2024
Via arXiv

Mr. DETR: Instructive Multi-Route Training for Detection Transformers

Dec 13, 2024
Via arXiv