Jinqiao Wang

Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences, objecteye.Inc

ST-Prune: Training-Free Spatio-Temporal Token Pruning for Vision-Language Models in Autonomous Driving

Apr 21, 2026

Decoupled Similarity for Task-Aware Token Pruning in Large Vision-Language Models

Apr 13, 2026

Semantic Noise Reduction via Teacher-Guided Dual-Path Audio-Visual Representation Learning

Apr 09, 2026

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Apr 02, 2026

PLUME: Latent Reasoning Based Universal Multimodal Embedding

Apr 02, 2026

Listening with the Eyes: Benchmarking Egocentric Co-Speech Grounding across Space and Time

Mar 09, 2026

TRACE: Task-Adaptive Reasoning and Representation Learning for Universal Multimodal Retrieval

Mar 04, 2026

WISER: Wider Search, Deeper Thinking, and Adaptive Fusion for Training-Free Zero-Shot Composed Image Retrieval

Feb 26, 2026

TraceVision: Trajectory-Aware Vision-Language Model for Human-Like Spatial Understanding

Feb 24, 2026

R-Diverse: Mitigating Diversity Illusion in Self-Play LLM Training

Feb 16, 2026