Chin-Hui Lee

A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition

Nov 02, 2022
Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee

Via arXiv

Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function

Oct 26, 2022
Qing Wang, Hang Chen, Ya Jiang, Zhe Wang, Yuyang Wang, Jun Du, Chin-Hui Lee

Via arXiv

An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition

Oct 13, 2022
Chao-Han Huck Yang, I-Fan Chen, Andreas Stolcke, Sabato Marco Siniscalchi, Chin-Hui Lee

Via arXiv

An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition

Oct 12, 2022
Chao-Han Huck Yang, Jun Qi, Sabato Marco Siniscalchi, Chin-Hui Lee

Via arXiv

A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification

Mar 31, 2022
Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee

Via arXiv

A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning

Feb 17, 2022
Hengshun Zhou, Jun Du, Chao-Han Huck Yang, Shifu Xiong, Chin-Hui Lee

Via arXiv

The USTC-Ximalaya system for the ICASSP 2022 multi-channel multi-party meeting transcription (M2MeT) challenge

Feb 10, 2022
Maokui He, Xiang Lv, Weilin Zhou, JingJing Yin, Xiaoqi Zhang, Yuxuan Wang, Shutong Niu, Yuhang Cao, Heng Lu, Jun Du, Chin-Hui Lee

Via arXiv

Information Fusion in Attention Networks Using Adaptive and Multi-level Factorized Bilinear Pooling for Audio-visual Emotion Recognition

Nov 17, 2021
Hengshun Zhou, Jun Du, Yuanyuan Zhang, Qing Wang, Qing-Feng Liu, Chin-Hui Lee

Via arXiv