Ming Sun

Pioneering Perceptual Video Fluency Assessment: A Novel Task with Benchmark Dataset and Baseline

Mar 27, 2026
Via arXiv

Tuning Real-World Image Restoration at Inference: A Test-Time Scaling Paradigm for Flow Matching Models

Mar 23, 2026
Via arXiv

ShiftLUT: Spatial Shift Enhanced Look-Up Tables for Efficient Image Restoration

Mar 03, 2026
Via arXiv

Equipping LLM with Directional Multi-Talker Speech Understanding Capabilities

Feb 06, 2026
Via arXiv

SA-VLA: Spatially-Aware Flow-Matching for Vision-Language-Action Reinforcement Learning

Jan 31, 2026
Via arXiv

InstantViR: Real-Time Video Inverse Problem Solver with Distilled Diffusion Prior

Nov 18, 2025
Via arXiv

Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses

Sep 17, 2025
Via arXiv

Generating Query-Relevant Document Summaries via Reinforcement Learning

Aug 11, 2025
Via arXiv

Bridging Video Quality Scoring and Justification via Large Multimodal Models

Jun 26, 2025
Via arXiv

Thinking in Directivity: Speech Large Language Model for Multi-Talker Directional Speech Recognition

Jun 17, 2025
Via arXiv