Yanzhen Ren

Audio-visual Event Localization on Portrait Mode Short Videos

Apr 09, 2025

SE4Lip: Speech-Lip Encoder for Talking Head Synthesis to Solve Phoneme-Viseme Alignment Ambiguity

Apr 08, 2025

Improving Speech Enhancement by Cross- and Sub-band Processing with State Space Model

Feb 22, 2025

FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder

Jul 05, 2024

Semantic Proximity Alignment: Towards Human Perception-consistent Audio Tagging by Aligning with Label Text Description

Sep 28, 2023

A Snoring Sound Dataset for Body Position Recognition: Collection, Annotation, and Analysis

Jul 25, 2023

Who is Speaking Actually? Robust and Versatile Speaker Traceability for Voice Conversion

May 09, 2023

Hiding Data in Colors: Secure and Lossless Deep Image Steganography via Conditional Invertible Neural Networks

Jan 19, 2022

Generalized Local Optimality for Video Steganalysis in Motion Vector Domain

Dec 22, 2021

Using contrastive learning to improve the performance of steganalysis schemes

Mar 01, 2021