"speech recognition": models, code, and papers

Learning to Count Words in Fluent Speech enables Online Speech Recognition

Jun 11, 2020
George Sterpu, Christian Saam, Naomi Harte

RT-1: Robotics Transformer for Real-World Control at Scale

Dec 13, 2022
Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich

Transformer Based Deliberation for Two-Pass Speech Recognition

Jan 27, 2021
Ke Hu, Ruoming Pang, Tara N. Sainath, Trevor Strohman

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition

Oct 07, 2021
Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland

Exploring WavLM on Speech Enhancement

Nov 18, 2022
Hyungchan Song, Sanyuan Chen, Zhuo Chen, Yu Wu, Takuya Yoshioka, Min Tang, Jong Won Shin, Shujie Liu

Dysfluencies Seldom Come Alone -- Detection as a Multi-Label Problem

Oct 28, 2022
Sebastian P. Bayerl, Dominik Wagner, Florian Hönig, Tobias Bocklet, Elmar Nöth, Korbinian Riedhammer

Neural Architecture Search for Speech Emotion Recognition

Mar 31, 2022
Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng

Unsupervised Fine-Tuning Data Selection for ASR Using Self-Supervised Speech Models

Dec 03, 2022
Reem Gody, David Harwath

Device Directedness with Contextual Cues for Spoken Dialog Systems

Nov 23, 2022
Dhanush Bekal, Sundararajan Srinivasan, Sravan Bodapati, Srikanth Ronanki, Katrin Kirchhoff

Simulating realistic speech overlaps improves multi-talker ASR

Nov 17, 2022
Muqiao Yang, Naoyuki Kanda, Xiaofei Wang, Jian Wu, Sunit Sivasankaran, Zhuo Chen, Jinyu Li, Takuya Yoshioka
