Haicheng Wang

Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training

Apr 20, 2026

POINTS-Long: Adaptive Dual-Mode Visual Reasoning in MLLMs

Apr 13, 2026

VersaViT: Enhancing MLLM Vision Backbones via Task-Guided Optimization

Feb 10, 2026

POINTS-GUI-G: GUI-Grounding Journey

Feb 06, 2026

Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings

May 30, 2025

Squeeze Out Tokens from Sample for Finer-Grained Data Governance

Mar 18, 2025

Contrast-Unity for Partially-Supervised Temporal Sentence Grounding

Feb 18, 2025

FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance

Jan 05, 2025

Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training

Nov 30, 2024

DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition

Apr 23, 2024