Yuanjun Lv

Qwen2-Audio Technical Report

Jul 15, 2024
Via arXiv

Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy

Jun 14, 2024
Via arXiv

FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter

Jun 12, 2024
Via arXiv

Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation

Jun 11, 2024
Via arXiv

RaD-Net 2: A Causal Two-stage Repairing and Denoising Speech Enhancement Network with Knowledge Distillation and Complex Axial Self-attention

Jun 11, 2024
Via arXiv

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Feb 12, 2024
Via arXiv

RaD-Net: A Repairing and Denoising Network for Speech Signal Improvement

Jan 09, 2024
Via arXiv

Vec-Tok Speech: Speech Vectorization and Tokenization for Neural Speech Generation

Oct 12, 2023
Via arXiv

SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation

Oct 08, 2023
Via arXiv

HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for Expressive Long-form TTS

Sep 25, 2023
Via arXiv