Picture for Helen Meng

Helen Meng

Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs

Add code
Aug 25, 2025
Viaarxiv icon

DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models

Add code
Aug 12, 2025
Viaarxiv icon

Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction

Add code
Jun 11, 2025
Viaarxiv icon

Naturalistic Language-related Movie-Watching fMRI Task for Detecting Neurocognitive Decline and Disorder

Add code
Jun 10, 2025
Viaarxiv icon

MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark

Add code
Jun 05, 2025
Viaarxiv icon

RAG-Zeval: Towards Robust and Interpretable Evaluation on RAG Responses through End-to-End Rule-Guided Reasoning

Add code
May 28, 2025
Viaarxiv icon

On-the-fly Routing for Zero-shot MoE Speaker Adaptation of Speech Foundation Models for Dysarthric Speech Recognition

Add code
May 28, 2025
Viaarxiv icon

$C^2$AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction

Add code
Apr 01, 2025
Viaarxiv icon

UniSep: Universal Target Audio Separation with Language Models at Scale

Add code
Mar 31, 2025
Viaarxiv icon

Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs

Add code
Mar 16, 2025
Viaarxiv icon