Yifan Yang

CLEAR: Context-Aware Learning with End-to-End Mask-Free Inference for Adaptive Video Subtitle Removal

Mar 23, 2026
Via arXiv

Satellite-to-Street: Synthesizing Post-Disaster Views from Satellite Imagery via Generative Vision Models

Mar 21, 2026
Via arXiv

Em-Garde: A Propose-Match Framework for Proactive Streaming Video Understanding

Mar 19, 2026
Via arXiv

DamageArbiter: A CLIP-Enhanced Multimodal Arbitration Framework for Hurricane Damage Assessment from Street-View Imagery

Mar 16, 2026
Via arXiv

SLICE: Semantic Latent Injection via Compartmentalized Embedding for Image Watermarking

Mar 13, 2026
Via arXiv

EvoTok: A Unified Image Tokenizer via Residual Latent Evolution for Visual Understanding and Generation

Mar 12, 2026
Via arXiv

Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution

Mar 05, 2026
Via arXiv

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Mar 03, 2026
Via arXiv

AoE: Always-on Egocentric Human Video Collection for Embodied AI

Mar 02, 2026
Via arXiv

CT-Bench: A Benchmark for Multimodal Lesion Understanding in Computed Tomography

Feb 16, 2026
Via arXiv