
Yunhong Wang

ContextQFormer: A New Context Modeling Method for Multi-Turn Multi-Modal Conversations

May 29, 2025
Via arXiv

TransBench: Breaking Barriers for Transferable Graphical User Interface Agents in Dynamic Digital Environments

May 23, 2025
Via arXiv

ToolSpectrum: Towards Personalized Tool Utilization for Large Language Models

May 19, 2025
Via arXiv

DDAE++: Enhancing Diffusion Models Towards Unified Generative and Discriminative Learning

May 16, 2025
Via arXiv

GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art

May 16, 2025
Via arXiv

Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion

May 16, 2025
Via arXiv

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding

Apr 30, 2025
Via arXiv

SkeletonX: Data-Efficient Skeleton-based Action Recognition via Cross-sample Feature Aggregation

Apr 16, 2025
Via arXiv

Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation

Apr 13, 2025
Via arXiv

APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers

Apr 03, 2025
Via arXiv