Jiaming Zhou

StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling

Jun 14, 2025
Via arXiv

RA-CLAP: Relation-Augmented Emotional Speaking Style Contrastive Language-Audio Pretraining For Speech Retrieval

May 26, 2025
Via arXiv

Omni-Perception: Omnidirectional Collision Avoidance for Legged Locomotion in Dynamic Environments

May 25, 2025
Via arXiv

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization

May 21, 2025
Via arXiv

Reinforcing Question Answering Agents with Minimalist Policy Gradient Optimization

May 20, 2025
Via arXiv

GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation

May 17, 2025
Via arXiv

Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides

Apr 21, 2025
Via arXiv

SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors

Mar 20, 2025
Via arXiv

CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition

Feb 26, 2025
Via arXiv

MusicEval: A Generative Music Corpus with Expert Ratings for Automatic Text-to-Music Evaluation

Jan 18, 2025
Via arXiv