Chen Zhao

Action Draft and Verify: A Self-Verifying Framework for Vision-Language-Action Model

Mar 18, 2026
Via arXiv

MVHOI: Bridge Multi-view Condition to Complex Human-Object Interaction Video Reenactment via 3D Foundation Model

Mar 16, 2026
Via arXiv

VisualLeakBench: Auditing the Fragility of Large Vision-Language Models against PII Leakage and Social Engineering

Mar 11, 2026
Via arXiv

Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

Mar 10, 2026
Via arXiv

Deconstructing Multimodal Mathematical Reasoning: Towards a Unified Perception-Alignment-Reasoning Paradigm

Mar 09, 2026
Via arXiv

GarmentPainter: Efficient 3D Garment Texture Synthesis with Character-Guided Diffusion Model

Mar 09, 2026
Via arXiv

RnG: A Unified Transformer for Complete 3D Modeling from Partial Observations

Mar 01, 2026
Via arXiv

RPDR: A Round-trip Prediction-Based Data Augmentation Framework for Long-Tail Question Answering

Feb 19, 2026
Via arXiv

Imaging-Derived Coronary Fractional Flow Reserve: Advances in Physics-Based, Machine-Learning, and Physics-Informed Methods

Feb 17, 2026
Via arXiv

LUVE: Latent-Cascaded Ultra-High-Resolution Video Generation with Dual Frequency Experts

Feb 12, 2026
Via arXiv