Hui Wang

Queen's University Belfast, UK

A survey of using EHR as real-world evidence for discovering and validating new drug indications

May 30, 2025

RA-CLAP: Relation-Augmented Emotional Speaking Style Contrastive Language-Audio Pretraining For Speech Retrieval

May 26, 2025

Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling

May 26, 2025

Enhancing Generalization of Speech Large Language Models with Multi-Task Behavior Imitation and Speech-Text Interleaving

May 24, 2025

CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training

May 23, 2025

Pushing the Frontiers of Self-Distillation Prototypes Network with Dimension Regularization and Score Normalization

May 20, 2025

Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides

Apr 21, 2025

From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs

Apr 18, 2025

LightFormer: A lightweight and efficient decoder for remote sensing image segmentation

Apr 15, 2025

Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis

Apr 14, 2025