Kai Han

and Other Contributors

GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models

Aug 21, 2024
Via arXiv

Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning

Aug 13, 2024
Via arXiv

HiLo: A Learning Framework for Generalized Category Discovery Robust to Domain Shifts

Aug 08, 2024
Via arXiv

LatentArtiFusion: An Effective and Efficient Histological Artifacts Restoration Framework

Jul 29, 2024
Via arXiv

PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery

Jul 26, 2024
Via arXiv

RegionDrag: Fast Region-Based Image Editing with Diffusion Models

Jul 25, 2024
Via arXiv

ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction

Jul 09, 2024
Via arXiv

A Survey on 3D Human Avatar Modeling -- From Reconstruction to Generation

Jun 06, 2024
Via arXiv

GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer

Jun 04, 2024
Via arXiv

SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain

May 28, 2024
Via arXiv