Chenrui Cui

Separate First, Fuse Later: Mitigating Cross-Modal Interference in Audio-Visual LLMs Reasoning with Modality-Specific Chain-of-Thought

May 11, 2026
Via arXiv

POTSA: A Cross-Lingual Speech Alignment Framework for Low Resource Speech-to-Text Translation

Nov 12, 2025
Via arXiv

Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module

Jan 05, 2025
Via arXiv

Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding

Dec 24, 2024
Via arXiv