Zhen Lei

Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation

Jun 06, 2025
Via arXiv

SA-Person: Text-Based Person Retrieval with Scene-aware Re-ranking

May 30, 2025
Via arXiv

From Data to Modeling: Fully Open-vocabulary Scene Graph Generation

May 26, 2025
Via arXiv

Benchmarking Unified Face Attack Detection via Hierarchical Prompt Tuning

May 19, 2025
Via arXiv

MLLM-Enhanced Face Forgery Detection: A Vision-Language Fusion Solution

May 04, 2025
Via arXiv

Compile Scene Graphs with Reinforcement Learning

Apr 18, 2025
Via arXiv

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Apr 01, 2025
Via arXiv

FA^{3}-CLIP: Frequency-Aware Cues Fusion and Attack-Agnostic Prompt Learning for Unified Face Attack Detection

Apr 01, 2025
Via arXiv

Mixture-of-Attack-Experts with Class Regularization for Unified Physical-Digital Face Attack Detection

Apr 01, 2025
Via arXiv

Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data

Mar 27, 2025
Via arXiv