Yuanhuiyi Lyu

TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

Apr 08, 2026
Via arXiv

SAP: Segment Any 4K Panorama

Mar 13, 2026
Via arXiv

EgoIntent: An Egocentric Step-level Benchmark for Understanding What, Why, and Next

Mar 12, 2026
Via arXiv

Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval

Feb 23, 2026
Via arXiv

T-Rex-Omni: Integrating Negative Visual Prompt in Generic Object Detection

Nov 12, 2025
Via arXiv

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

Oct 29, 2025
Via arXiv

AI for Service: Proactive Assistance with AI Glasses

Oct 16, 2025
Via arXiv

PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs

Oct 10, 2025
Via arXiv

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods

Oct 08, 2025
Via arXiv

PANORAMA: The Rise of Omnidirectional Vision in the Embodied AI Era

Sep 16, 2025
Via arXiv