Shiming Xiang

SeaVIS: Sound-Enhanced Association for Online Audio-Visual Instance Segmentation

Mar 02, 2026
Via arXiv

HVR-Met: A Hypothesis-Verification-Replaning Agentic System for Extreme Weather Diagnosis

Mar 01, 2026
Via arXiv

CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering

Feb 27, 2026
Via arXiv

InfEngine: A Self-Verifying and Self-Optimizing Intelligent Engine for Infrared Radiation Computing

Feb 22, 2026
Via arXiv

Beyond Next-Token Alignment: Distilling Multimodal Large Language Models via Token Interactions

Feb 10, 2026
Via arXiv

Enhanced Graph Transformer with Serialized Graph Tokens

Feb 09, 2026
Via arXiv

DSFC-Net: A Dual-Encoder Spatial and Frequency Co-Awareness Network for Rural Road Extraction

Feb 01, 2026
Via arXiv

QE-Catalytic: A Graph-Language Multimodal Base Model for Relaxed-Energy Prediction in Catalytic Adsorption

Dec 23, 2025
Via arXiv

IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting

Dec 10, 2025
Via arXiv

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Aug 28, 2025
Via arXiv