Ge Zhu

MusicHiFi: Fast High-Fidelity Stereo Vocoding

Mar 20, 2024

Cacophony: An Improved Contrastive Audio-Text Model

Feb 10, 2024

EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis

Nov 18, 2023

Transcription free filler word detection with Neural semi-CRFs

Mar 11, 2023

Sharp Eyes: A Salient Object Detector Working The Same Way as Human Visual Characteristics

Jan 18, 2023

Music Source Separation with Generative Flow

Apr 26, 2022

Filler Word Detection and Classification: A Dataset and Benchmark

Mar 28, 2022

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification

Mar 02, 2022

A study of the robustness of raw waveform based speaker embeddings under mismatched conditions

Oct 11, 2021

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021

Aug 23, 2021