Ran He

InfiniteTalk: Audio-driven Video Generation for Sparse-Frame Video Dubbing

Aug 19, 2025
Via arXiv

Adapting Vision-Language Models Without Labels: A Comprehensive Survey

Aug 07, 2025
Via arXiv

Test-Time Immunization: A Universal Defense Framework Against Jailbreaks for (Multimodal) Large Language Models

May 28, 2025
Via arXiv

HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling

May 27, 2025
Via arXiv

T^2Agent: A Tool-augmented Multimodal Misinformation Detection Agent with Monte Carlo Tree Search

May 26, 2025
Via arXiv

Breaking Complexity Barriers: High-Resolution Image Restoration with Rank Enhanced Linear Attention

May 22, 2025
Via arXiv

Unlocking the Potential of Difficulty Prior in RL-based Multimodal Reasoning

May 19, 2025
Via arXiv

NOFT: Test-Time Noise Finetune via Information Bottleneck for Highly Correlated Asset Creation

May 18, 2025
Via arXiv

Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs

May 17, 2025
Via arXiv

DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling

May 16, 2025
Via arXiv