Jianzong Wang

WindowQuant: Mixed-Precision KV Cache Quantization based on Window-Level Similarity for VLMs Inference Optimization

May 04, 2026
Via arXiv

Evolvable Embodied Agent for Robotic Manipulation via Long Short-Term Reflection and Optimization

Apr 15, 2026
Via arXiv

VLA-InfoEntropy: A Training-Free Vision-Attention Information Entropy Approach for Vision-Language-Action Models Inference Acceleration and Success

Apr 07, 2026
Via arXiv

Confusion-Aware In-Context-Learning for Vision-Language Models in Robotic Manipulation

Mar 16, 2026
Via arXiv

Vista: Scene-Aware Optimization for Streaming Video Question Answering under Post-Hoc Queries

Feb 09, 2026
Via arXiv

Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition

Feb 02, 2026
Via arXiv

From Knowing to Doing Precisely: A General Self-Correction and Termination Framework for VLA models

Feb 02, 2026
Via arXiv

Triage: Hierarchical Visual Budgeting for Efficient Video Reasoning in Vision-Language Models

Jan 30, 2026
Via arXiv

CARE: Multi-Task Pretraining for Latent Continuous Action Representation in Robot Control

Jan 30, 2026
Via arXiv

MiTa: A Hierarchical Multi-Agent Collaboration Framework with Memory-integrated and Task Allocation

Jan 30, 2026
Via arXiv