Zeyu Zhang

FlashSign: Pose-Free Guidance for Efficient Sign Language Video Generation

Mar 30, 2026

Rethinking Token Pruning for Historical Screenshots in GUI Visual Agents: Semantic, Spatial, and Temporal Perspectives

Mar 27, 2026

MWM: Mobile World Models for Action-Conditioned Consistent Prediction

Mar 08, 2026

GeoWorld: Geometric World Models

Feb 26, 2026

OmniOCR: Generalist OCR for Ethnic Minority Languages

Feb 24, 2026

OCR-Agent: Agentic OCR with Capability and Memory Reflection

Feb 24, 2026

Decoupling Defense Strategies for Robust Image Watermarking

Feb 23, 2026

All Leaks Count, Some Count More: Interpretable Temporal Contamination Detection in LLM Backtesting

Feb 19, 2026

PartRAG: Retrieval-Augmented Part-Level 3D Generation and Editing

Feb 19, 2026

MMA: Multimodal Memory Agent

Feb 18, 2026