Dongchao Yang

MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark

Jun 05, 2025

SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline

May 25, 2025

Kimi-Audio Technical Report

Apr 25, 2025

UniSep: Universal Target Audio Separation with Language Models at Scale

Mar 31, 2025

MoonCast: High-Quality Zero-Shot Podcast Generation

Mar 19, 2025

InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training

Mar 04, 2025

Audio-FLAN: A Preliminary Release

Feb 23, 2025

ATRI: Mitigating Multilingual Audio Text Retrieval Inconsistencies by Reducing Data Distribution Errors

Feb 22, 2025

A Comparative Study of Discrete Speech Tokens for Semantic-Related Tasks with Large Language Models

Nov 13, 2024

Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models

Sep 21, 2024