Jungong Han

Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning

Feb 22, 2026

SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos

Dec 09, 2025

PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning

Oct 22, 2025

Point Linguist Model: Segment Any Object via Bridged Large 3D-Language Model

Sep 09, 2025

Unlocking the Potential of Diffusion Priors in Blind Face Restoration

Aug 12, 2025

Modality-Aware Feature Matching: A Comprehensive Review of Single- and Cross-Modality Techniques

Jul 30, 2025

DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval

Jun 10, 2025

THU-Warwick Submission for EPIC-KITCHEN Challenge 2025: Semi-Supervised Video Object Segmentation

Jun 07, 2025

Interpretable Few-Shot Image Classification via Prototypical Concept-Guided Mixture of LoRA Experts

Jun 05, 2025

AdaTP: Attention-Debiased Token Pruning for Video Large Language Models

May 26, 2025