Berlin Chen

MuFFIN: Multifaceted Pronunciation Feedback Model with Interactive Hierarchical Neural Modeling

Oct 06, 2025
Via arXiv

Session-Level Spoken Language Assessment with a Multimodal Foundation Model via Multi-Target Learning

Sep 19, 2025
Via arXiv

Beyond Modality Limitations: A Unified MLLM Approach to Automated Speaking Assessment with Effective Curriculum Learning

Aug 18, 2025
Via arXiv

QAMRO: Quality-aware Adaptive Margin Ranking Optimization for Human-aligned Assessment of Audio Generation Systems

Aug 12, 2025
Via arXiv

Revealing the Role of Audio Channels in ASR Performance Degradation

Aug 12, 2025
Via arXiv

JCAPT: A Joint Modeling Approach for CAPT

Jun 24, 2025
Via arXiv

The NTNU System at the S&I Challenge 2025 SLA Open Track

Jun 05, 2025
Via arXiv

Acoustically Precise Hesitation Tagging Is Essential for End-to-End Verbatim Transcription Systems

Jun 04, 2025
Via arXiv

A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions

Jun 04, 2025
Via arXiv

Long-Context State-Space Video World Models

May 26, 2025
Via arXiv