Zhengyao Fang

PhoneBuddy: Training Open Models for Agentic Phone Use

Jun 24, 2026
Via arXiv

PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions

Jun 12, 2026
Via arXiv

PhoneWorld: Scaling Phone-Use Agent Environments

May 28, 2026
Via arXiv

Manifold-Optimal Guidance: A Unified Riemannian Control View of Diffusion Guidance

Mar 12, 2026
Via arXiv

Too Vivid to Be Real? Benchmarking and Calibrating Generative Color Fidelity

Mar 11, 2026
Via arXiv

Prune Redundancy, Preserve Essence: Vision Token Compression in VLMs via Synergistic Importance-Diversity

Mar 11, 2026
Via arXiv

Recognition-Synergistic Scene Text Editing

Mar 11, 2025
Via arXiv

WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-only Supervised Text Spotting

Jul 28, 2024
Via arXiv