Mingshu Chen

MOSS-TTSD: Text to Spoken Dialogue Generation

Mar 20, 2026

MOSS-TTS Technical Report

Mar 18, 2026

MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models

Feb 12, 2026

MOVA: Towards Scalable and Synchronized Video-Audio Generation

Feb 09, 2026

MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance

Oct 02, 2025

DynOPETs: A Versatile Benchmark for Dynamic Object Pose Estimation and Tracking in Moving Camera Scenarios

Mar 25, 2025