
Chin-Hui Lee

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

Mar 07, 2024
Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee

Via arXiv

Bayesian adaptive learning to latent variables via Variational Bayes and Maximum a Posteriori

Jan 24, 2024
Hu Hu, Sabato Marco Siniscalchi, Chin-Hui Lee

Via arXiv

Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture

Sep 17, 2023
Gaobin Yang, Maokui He, Shutong Niu, Ruoyu Wang, Yanyan Yue, Shuangqing Qian, Shilong Wu, Jun Du, Chin-Hui Lee

Via arXiv

Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints

Sep 16, 2023
Hao Yen, Sabato Marco Siniscalchi, Chin-Hui Lee

Via arXiv

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction

Sep 15, 2023
Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, Jianqing Gao

Via arXiv

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

Aug 28, 2023
Ruoyu Wang, Maokui He, Jun Du, Hengshun Zhou, Shutong Niu, Hang Chen, Yanyan Yue, Gaobin Yang, Shilong Wu, Lei Sun, Yanhui Tu, Haitao Tang, Shuangqing Qian, Tian Gao, Mengzhi Wang, Genshun Wan, Jia Pan, Jianqing Gao, Chin-Hui Lee

Via arXiv

Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder

Aug 14, 2023
Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee

Via arXiv

Semi-supervised multi-channel speaker diarization with cross-channel attention

Jul 17, 2023
Shilong Wu, Jun Du, Maokui He, Shutong Niu, Hang Chen, Haitao Tang, Chin-Hui Lee

Via arXiv

Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement

Jun 14, 2023
Zilu Guo, Jun Du, Chin-Hui Lee, Yu Gao, Wenbin Zhang

Via arXiv

A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models

Jun 01, 2023
Pin-Jui Ku, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Via arXiv