Yixuan Zhou

EgoLoc: A Generalizable Solution for Temporal Interaction Localization in Egocentric Videos

Aug 17, 2025

A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understanding

Aug 07, 2025

"In This Environment, As That Speaker": A Text-Driven Framework for Multi-Attribute Speech Conversion

Jun 08, 2025

Probe by Gaming: A Game-based Benchmark for Assessing Conceptual Knowledge in LLMs

May 23, 2025

UTTG: A Universal Teleoperation Approach via Online Trajectory Generation

Apr 28, 2025

DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models

Feb 27, 2025

The Codec Language Model-based Zero-Shot Spontaneous Style TTS System for CoVoC Challenge 2024

Dec 02, 2024

SongCreator: Lyrics-based Universal Song Generation

Sep 09, 2024

VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization

Sep 02, 2024

VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling

Aug 28, 2024