
Yixuan Zhou

"In This Environment, As That Speaker": A Text-Driven Framework for Multi-Attribute Speech Conversion

Jun 08, 2025

Probe by Gaming: A Game-based Benchmark for Assessing Conceptual Knowledge in LLMs

May 23, 2025

UTTG: A Universal Teleoperation Approach via Online Trajectory Generation

Apr 28, 2025

DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models

Feb 27, 2025

The Codec Language Model-based Zero-Shot Spontaneous Style TTS System for CoVoC Challenge 2024

Dec 02, 2024

SongCreator: Lyrics-based Universal Song Generation

Sep 09, 2024

VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization

Sep 02, 2024

VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling

Aug 28, 2024

Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models

Jul 18, 2024

The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge

Apr 25, 2024