Zhen Lei

Pose-RFT: Enhancing MLLMs for 3D Pose Generation via Hybrid Action Reinforcement Fine-Tuning

Aug 11, 2025
Via arXiv

MM2CT: MR-to-CT translation for multi-modal image fusion with mamba

Aug 07, 2025
Via arXiv

Multimodal Causal-Driven Representation Learning for Generalizable Medical Image Segmentation

Aug 07, 2025
Via arXiv

F2PASeg: Feature Fusion for Pituitary Anatomy Segmentation in Endoscopic Surgery

Aug 07, 2025
Via arXiv

Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation

Jun 06, 2025
Via arXiv

SA-Person: Text-Based Person Retrieval with Scene-aware Re-ranking

May 30, 2025
Via arXiv

From Data to Modeling: Fully Open-vocabulary Scene Graph Generation

May 26, 2025
Via arXiv

Benchmarking Unified Face Attack Detection via Hierarchical Prompt Tuning

May 19, 2025
Via arXiv

MLLM-Enhanced Face Forgery Detection: A Vision-Language Fusion Solution

May 04, 2025
Via arXiv

Compile Scene Graphs with Reinforcement Learning

Apr 18, 2025
Via arXiv