Kun Yuan

Bridging Video Quality Scoring and Justification via Large Multimodal Models

Jun 26, 2025
Via arXiv

Efficient Long-Context LLM Inference via KV Cache Clustering

Jun 13, 2025
Via arXiv

EndoVLA: Dual-Phase Vision-Language-Action Model for Autonomous Tracking in Endoscopy

May 21, 2025
Via arXiv

ORQA: A Benchmark and Foundation Model for Holistic Operating Room Modeling

May 19, 2025
Via arXiv

Diffusion Learning with Partial Agent Participation and Local Updates

May 16, 2025
Via arXiv

NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: KwaiSR Dataset and Study

Apr 21, 2025
Via arXiv

Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery

Apr 02, 2025
Via arXiv

Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity

Mar 20, 2025
Via arXiv

KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception

Mar 13, 2025
Via arXiv

MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments

Mar 04, 2025
Via arXiv