Chin-Hui Lee

Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture

Sep 17, 2023
Via arXiv

Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints

Sep 16, 2023
Via arXiv

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction

Sep 15, 2023
Via arXiv

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

Aug 28, 2023
Via arXiv

Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder

Aug 14, 2023
Via arXiv

Semi-supervised multi-channel speaker diarization with cross-channel attention

Jul 17, 2023
Via arXiv

Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement

Jun 14, 2023
Via arXiv

A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models

Jun 01, 2023
Via arXiv

A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition

Nov 02, 2022
Via arXiv

Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function

Oct 26, 2022
Via arXiv