Sabato Marco Siniscalchi

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

Feb 08, 2024
Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, Ensiong Chng, Chao-Han Huck Yang

Bayesian adaptive learning to latent variables via Variational Bayes and Maximum a Posteriori

Jan 24, 2024
Hu Hu, Sabato Marco Siniscalchi, Chin-Hui Lee

Generative error correction for code-switching speech recognition using large language models

Oct 17, 2023
Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Hexin Liu, Sabato Marco Siniscalchi, Eng Siong Chng

Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints

Sep 16, 2023
Hao Yen, Sabato Marco Siniscalchi, Chin-Hui Lee

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction

Sep 15, 2023
Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, Jianqing Gao

S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction

Jul 13, 2023
Mohammad Adiban, Kalin Stefanov, Sabato Marco Siniscalchi, Giampiero Salvi

How word semantics and phonology affect handwriting of Alzheimer's patients: a machine learning based analysis

Jul 06, 2023
Nicole Dalia Cilia, Claudio De Stefano, Francesco Fontanella, Sabato Marco Siniscalchi

A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models

Jun 01, 2023
Pin-Jui Ku, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Differentially Private Adapters for Parameter Efficient Acoustic Modeling

May 19, 2023
Chun-Wei Ho, Chao-Han Huck Yang, Sabato Marco Siniscalchi

A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition

Nov 02, 2022
Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee
