"speech": models, code, and papers

Implicit Acoustic Echo Cancellation for Keyword Spotting and Device-Directed Speech Detection

Nov 20, 2021
Samuele Cornell, Thomas Balestri, Thibaud Sénéchal

Figures 1-4.

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Apr 21, 2022
Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana

Figures 1-4.

Mixed Emotion Modelling for Emotional Voice Conversion

Oct 26, 2022
Kun Zhou, Berrak Sisman, Carlos Busso, Haizhou Li

Figures 1-4.

Automatic Speech Recognition Datasets in Cantonese Language: A Survey and a New Dataset

Jan 07, 2022
Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Shadow Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

Figures 1-4.

RT-1: Robotics Transformer for Real-World Control at Scale

Dec 13, 2022
Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich

Figures 1-4.

Musical Speech: A Transformer-based Composition Tool

Aug 02, 2021
Jason d'Eon, Sri Harsha Dumpala, Chandramouli Shama Sastry, Dani Oore, Sageev Oore

Figures 1-4.

Fusing ASR Outputs in Joint Training for Speech Emotion Recognition

Oct 29, 2021
Yuanchao Li, Peter Bell, Catherine Lai

Figures 1-4.

DPCCN: Densely-Connected Pyramid Complex Convolutional Network for Robust Speech Separation And Extraction

Dec 27, 2021
Jiangyu Han, Yanhua Long, Lukas Burget, Jan Cernocky

Figures 1-4.

Pseudo-Labeling for Massively Multilingual Speech Recognition

Oct 30, 2021
Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Figures 1-4.

Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings

Oct 30, 2022
Hao Yen, Woojay Jeon

Figures 1-4.