Ruijie Tao

Interpolating Speaker Identities in Embedding Space for Data Expansion

Aug 26, 2025
Via arXiv

Unified Audio Event Detection

Sep 13, 2024
Via arXiv

Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization

Jul 25, 2024
Via arXiv

A Benchmark for Multi-speaker Anonymization

Jul 08, 2024
Via arXiv

Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection

Jun 25, 2024
Via arXiv

Target Speech Diarization with Multimodal Prompts

Jun 11, 2024
Via arXiv

How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?

Jun 04, 2024
Via arXiv

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention

Apr 29, 2024
Via arXiv

Voice Conversion Augmentation for Speaker Recognition on Defective Datasets

Apr 01, 2024
Via arXiv

Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training

Apr 01, 2024
Via arXiv