Kaizhu Huang

Towards Faithful Reasoning in Comics for Small MLLMs

Jan 06, 2026

Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?

Oct 16, 2025

The Demon is in Ambiguity: Revisiting Situation Recognition with Single Positive Multi-Label Learning

Aug 29, 2025

Exploiting Layer Normalization Fine-tuning in Visual Transformer Foundation Models for Classification

Aug 11, 2025

GeoSDF: Plane Geometry Diagram Synthesis via Signed Distance Field

Jun 16, 2025

DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinates-based Diffusion Model

May 28, 2025

Interpretable Zero-shot Learning with Infinite Class Concepts

May 06, 2025

Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait

Mar 17, 2025

BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis

Mar 16, 2025

Consistency Diffusion Models for Single-Image 3D Reconstruction with Priors

Jan 31, 2025