Lei Zhu

EventRR: Event Referential Reasoning for Referring Video Object Segmentation

Aug 10, 2025

Think Before You Talk: Enhancing Meaningful Dialogue Generation in Full-Duplex Speech Language Models with Planning-Inspired Text Guidance

Aug 10, 2025

S2-UniSeg: Fast Universal Agglomerative Pooling for Scalable Segment Anything without Supervision

Aug 09, 2025

Unified modality separation: A vision-language framework for unsupervised domain adaptation

Aug 07, 2025

HRVVS: A High-resolution Video Vasculature Segmentation Network via Hierarchical Autoregressive Residual Priors

Jul 30, 2025

MaskedCLIP: Bridging the Masked and CLIP Space for Semi-Supervised Medical Vision-Language Pre-training

Jul 23, 2025

The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs

Jul 10, 2025

AdvMIM: Adversarial Masked Image Modeling for Semi-Supervised Medical Image Segmentation

Jun 25, 2025

Surgery-R1: Advancing Surgical-VQLA with Reasoning Multimodal Large Language Model via Reinforcement Learning

Jun 24, 2025

PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework

Jun 12, 2025