"speech": models, code, and papers

Using a Large Language Model to Control Speaking Style for Expressive TTS

May 17, 2023
Atli Thor Sigurgeirsson, Simon King

2nd Swiss German Speech to Standard German Text Shared Task at SwissText 2022

Jan 17, 2023
Michel Plüss, Yanick Schraner, Christian Scheller, Manfred Vogel

MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems

Mar 10, 2023
Aminul Huq, Weiyi Zhang, Xiaolin Hu

Speech-to-Speech Translation For A Real-world Unwritten Language

Nov 11, 2022
Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Pino, Wei-Ning Hsu, Ann Lee

Connecting Humanities and Social Sciences: Applying Language and Speech Technology to Online Panel Surveys

Feb 21, 2023
Henk van den Heuvel, Martijn Bentum, Simone Wills, Judith C. Koops

An ASR-Based Tutor for Learning to Read: How to Optimize Feedback to First Graders

Jun 07, 2023
Yu Bai, Cristian Tejedor-Garcia, Ferdy Hubers, Catia Cucchiarini, Helmer Strik

Anytime, Anywhere: Human Arm Pose from Smartwatch Data for Ubiquitous Robot Control and Teleoperation

Jun 22, 2023
Fabian C Weigend, Shubham Sonawani, Michael Drolet, Heni Ben Amor

Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech

Feb 27, 2023
Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari

Relate auditory speech to EEG by shallow-deep attention-based network

Mar 20, 2023
Fan Cui, Liyong Guo, Lang He, Jiyao Liu, ErCheng Pei, Yujun Wang, Dongmei Jiang

Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities

Jul 04, 2023
Riccardo Orlando, Simone Conia, Roberto Navigli
