Yangyang Zhong

TextVidBench: A Benchmark for Long Video Scene Text Understanding

Jun 05, 2025
Via arXiv

ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation

May 22, 2025
Via arXiv

Adept: Annotation-Denoising Auxiliary Tasks with Discrete Cosine Transform Map and Keypoint for Human-Centric Pretraining

Apr 29, 2025
Via arXiv

Unveiling the Impact of Multi-Modal Interactions on User Engagement: A Comprehensive Evaluation in AI-driven Conversations

Jun 21, 2024
Via arXiv