
"speech": models, code, and papers

Knowledge-driven Subword Grammar Modeling for Automatic Speech Recognition in Tamil and Kannada

Jul 27, 2022
Madhavaraj A, Bharathi Pilar, Ramakrishnan A G

Via arXiv

Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition

Jul 29, 2022
Peng Shen, Xugang Lu, Hisashi Kawai

Via arXiv

Dual Learning for Large Vocabulary On-Device ASR

Jan 11, 2023
Cal Peyser, Ronny Huang, Tara Sainath, Rohit Prabhavalkar, Michael Picheny, Kyunghyun Cho

Via arXiv

Improving And Analyzing Neural Speaker Embeddings for ASR

Jan 11, 2023
Christoph Lüscher, Jingjing Xu, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney

Via arXiv

Acoustically-Driven Phoneme Removal That Preserves Vocal Affect Cues

Oct 26, 2022
Camille Noufi, Jonathan Berger, Michael Frank, Karen Parker, Daniel L. Bowling

Via arXiv

Practical cognitive speech compression

Mar 08, 2022
Reza Lotfidereshgi, Philippe Gournay

Via arXiv

SLICER: Learning universal audio representations using low-resource self-supervised pre-training

Nov 02, 2022
Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

Via arXiv

ConferencingSpeech 2022 Challenge: Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge for Online Conferencing Applications

Apr 01, 2022
Gaoxiong Yi, Wei Xiao, Yiming Xiao, Babak Naderi, Sebastian Möller, Wafaa Wardah, Gabriel Mittag, Ross Cutler, Zhuohuang Zhang, Donald S. Williamson, Fei Chen, Fuzheng Yang, Shidong Shang

Via arXiv

Lived Experience Matters: Automatic Detection of Stigma toward People Who Use Substances on Social Media

Feb 04, 2023
Salvatore Giorgi, Douglas Bellew, Daniel Roy Sadek Habib, Joao Sedoc, Chase Smitterberg, Amanda Devoto, McKenzie Himelein-Wachowiak, Brenda Curtis

Via arXiv

Joint Far- and Near-End Speech Intelligibility Enhancement based on the Approximated Speech Intelligibility Index

Nov 15, 2021
Andreas Jonas Fuglsig, Jan Østergaard, Jesper Jensen, Lars Søndergaard Bertelsen, Peter Mariager, Zheng-Hua Tan

Via arXiv