"speech recognition": models, code, and papers

Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models

May 30, 2023
Danilo de Oliveira, Navin Raj Prabhu, Timo Gerkmann

Perception and Semantic Aware Regularization for Sequential Confidence Calibration

May 31, 2023
Zhenghua Peng, Yu Luo, Tianshui Chen, Keke Xu, Shuangping Huang

A Comparative Study of Pre-trained Speech and Audio Embeddings for Speech Emotion Recognition

Apr 22, 2023
Orchid Chetia Phukan, Arun Balaji Buduru, Rajesh Sharma

Hierarchical Network with Decoupled Knowledge Distillation for Speech Emotion Recognition

Mar 09, 2023
Ziping Zhao, Huan Wang, Haishuai Wang, Bjorn Schuller

A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings

Nov 01, 2022
Mohan Shi, Jie Zhang, Zhihao Du, Fan Yu, Shiliang Zhang, Li-Rong Dai

SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces

Jul 23, 2023
Ivan Vallés-Pérez, Grzegorz Beringer, Piotr Bilinski, Gary Cook, Roberto Barra-Chicote

Accented Speech Recognition: A Survey

Apr 21, 2021
Arthur Hinsvark, Natalie Delworth, Miguel Del Rio, Quinten McNamara, Joshua Dong, Ryan Westerman, Michelle Huang, Joseph Palakapilly, Jennifer Drexler, Ilya Pirkin, Nishchal Bhandari, Miguel Jette

Back Translation for Speech-to-text Translation Without Transcripts

May 15, 2023
Qingkai Fang, Yang Feng

Improving Scheduled Sampling for Neural Transducer-based ASR

May 25, 2023
Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura

Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models

May 09, 2023
Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka
