Yaowei Wang

Interactive Tracking: A Human-in-the-Loop Paradigm with Memory-Augmented Adaptation

Apr 02, 2026
Via arXiv

A Step Toward Federated Pretraining of Multimodal Large Language Models

Mar 25, 2026
Via arXiv

Cluster-Wise Spatio-Temporal Masking for Efficient Video-Language Pretraining

Mar 24, 2026
Via arXiv

From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents

Mar 02, 2026
Via arXiv

EPRBench: A High-Quality Benchmark Dataset for Event Stream Based Visual Place Recognition

Feb 13, 2026
Via arXiv

DeltaKV: Residual-Based KV Cache Compression via Long-Range Similarity

Feb 08, 2026
Via arXiv

MirrorLA: Reflecting Feature Map for Vision Linear Attention

Feb 04, 2026
Via arXiv

Seeing Through the Chain: Mitigate Hallucination in Multimodal Reasoning Models via CoT Compression and Contrastive Preference Optimization

Feb 03, 2026
Via arXiv

CLIP-Guided Adaptable Self-Supervised Learning for Human-Centric Visual Tasks

Jan 19, 2026
Via arXiv

Splatwizard: A Benchmark Toolkit for 3D Gaussian Splatting Compression

Dec 31, 2025
Via arXiv