Samuele Cornell

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

Oct 27, 2023
Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis

One model to rule them all ? Towards End-to-End Joint Speaker Diarization and Speech Recognition

Oct 02, 2023
Samuele Cornell, Jee-weon Jung, Shinji Watanabe, Stefano Squartini

A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment

Jul 28, 2023
Carlo Aironi, Samuele Cornell, Luca Serafini, Stefano Squartini

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

Jul 23, 2023
Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

Jul 14, 2023
Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Matthew Maciejewski, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur

An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings

May 29, 2023
Luca Serafini, Samuele Cornell, Giovanni Morrone, Enrico Zovato, Alessio Brutti, Stefano Squartini

Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling

Apr 18, 2023
Zhong-Qiu Wang, Samuele Cornell, Shukjae Choi, Younglo Lee, Byeong-Yeol Kim, Shinji Watanabe

End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations

Mar 21, 2023
Giovanni Morrone, Samuele Cornell, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini

Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge

Feb 15, 2023
Samuele Cornell, Zhong-Qiu Wang, Yoshiki Masuyama, Shinji Watanabe, Manuel Pariente, Nobutaka Ono
