Wenxi Chen

SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation

Apr 22, 2025
Via arXiv

EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting

Apr 22, 2025
Via arXiv

Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection

Mar 24, 2025
Via arXiv

URO-Bench: A Comprehensive Benchmark for End-to-End Spoken Dialogue Models

Feb 25, 2025
Via arXiv

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training

Dec 20, 2024
Via arXiv

DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning

Oct 12, 2024
Via arXiv

SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs

Oct 12, 2024
Via arXiv

ChatEMG: Synthetic Data Generation to Control a Robotic Hand Orthosis for Stroke

Jun 17, 2024
Via arXiv

EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

Jun 11, 2024
Via arXiv

Meta-Learning for Fast Adaptation in Intent Inferral on a Robotic Hand Orthosis for Stroke

Mar 19, 2024
Via arXiv