"speech": models, code, and papers

LLaSM: Large Language and Speech Model

Aug 30, 2023
Yu Shu, Siwei Dong, Guangyao Chen, Wenhao Huang, Ruihua Zhang, Daochen Shi, Qiqi Xiang, Yemin Shi

Via arXiv

Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting

Sep 15, 2023
Tiantian Feng, Shrikanth Narayanan

Via arXiv

Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring

Oct 17, 2023
Ankitha Sudarshan, Vinay Samuel, Parth Patwa, Ibtihel Amara, Aman Chadha

Via arXiv

Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization

Sep 27, 2023
Amir Hussein, Brian Yan, Antonios Anastasopoulos, Shinji Watanabe, Sanjeev Khudanpur

Via arXiv

GRASS: Unified Generation Model for Speech Semantic Understanding

Sep 06, 2023
Aobo Xia, Shuyu Lei, Yushu Yang, Xiang Guo, Hua Chai

Via arXiv

Soft Random Sampling: A Theoretical and Empirical Analysis

Nov 24, 2023
Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei Zhang, George Saon, Brian Kingsbury

Via arXiv

Keyword Augmented Retrieval: Novel framework for Information Retrieval integrated with speech interface

Oct 06, 2023
Anupam Purwar, Rahul Sundar

Via arXiv

Utilizing Whisper to Enhance Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids

Sep 18, 2023
Ryandhimas E. Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao

Via arXiv

RoDia: A New Dataset for Romanian Dialect Identification from Speech

Sep 12, 2023
Codrut Rotaru, Nicolae-Catalin Ristea, Radu Tudor Ionescu

Via arXiv

Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model

Sep 20, 2023
Xinyu Zhou, Delong Chen, Yudong Chen

Via arXiv