"speech": models, code, and papers

Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers

Jul 08, 2021
Huahuan Zheng, Wenjie Peng, Zhijian Ou, Jinsong Zhang

Novel Dual-Channel Long Short-Term Memory Compressed Capsule Networks for Emotion Recognition

Dec 26, 2021
Ismail Shahin, Noor Hindawi, Ali Bou Nassif, Adi Alhudhaif, Kemal Polat

Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam

Jan 23, 2020
Marc Delcroix, Tsubasa Ochiai, Katerina Zmolikova, Keisuke Kinoshita, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki

Evaluating MT Systems: A Theoretical Framework

Feb 11, 2022
Rajeev Sangal

Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference?

Jun 02, 2021
Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi

Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise

Apr 28, 2020
Shan Yang, Yuxuan Wang, Lei Xie

Optimizing Speech Emotion Recognition using Manta-Ray Based Feature Selection

Sep 18, 2020
Soham Chattopadhyay, Arijit Dey, Hritam Basak

PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss

Aug 11, 2020
Umut Isik, Ritwik Giri, Neerad Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy

An Asynchronous WFST-Based Decoder For Automatic Speech Recognition

Mar 16, 2021
Hang Lv, Zhehuai Chen, Hainan Xu, Daniel Povey, Lei Xie, Sanjeev Khudanpur

Multilingual and Multi-Aspect Hate Speech Analysis

Aug 29, 2019
Nedjma Ousidhoum, Zizheng Lin, Hongming Zhang, Yangqiu Song, Dit-Yan Yeung
