Erwei Yin

Event-Causal RAG: A Retrieval-Augmented Generation Framework for Long Video Reasoning in Complex Scenarios

May 07, 2026

DBMIF: a deep balanced multimodal iterative fusion framework for air- and bone-conduction speech enhancement

Mar 03, 2026

Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition

Jan 18, 2026

OMG-Bench: A New Challenging Benchmark for Skeleton-based Online Micro Hand Gesture Recognition

Dec 18, 2025

AFD-SLU: Adaptive Feature Distillation for Spoken Language Understanding

Sep 05, 2025

MMME: A Spontaneous Multi-Modal Micro-Expression Dataset Enabling Visual-Physiological Fusion

Jun 12, 2025

MPFNet: A Multi-Prior Fusion Network with a Progressive Training Strategy for Micro-Expression Recognition

Jun 11, 2025

Generating Vision-Language Navigation Instructions Incorporated Fine-Grained Alignment Annotations

Jun 10, 2025

ST-Booster: An Iterative SpatioTemporal Perception Booster for Vision-and-Language Navigation in Continuous Environments

Apr 14, 2025

PanoGen++: Domain-Adapted Text-Guided Panoramic Environment Generation for Vision-and-Language Navigation

Mar 13, 2025