"speech": models, code, and papers

Timestamped Embedding-Matching Acoustic-to-Word CTC ASR

Jun 20, 2023
Woojay Jeon

Via arXiv

VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing

Nov 30, 2022
Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian

Via arXiv

BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition

Jan 16, 2023
Will Rieger

Via arXiv

Detecting and Characterizing Political Incivility on Social Media

May 24, 2023
Sagi Pendzel, Nir Lotan, Alon Zoizner, Einat Minkov

Via arXiv

Wireless Deep Speech Semantic Transmission

Nov 04, 2022
Zixuan Xiao, Shengshi Yao, Jincheng Dai, Sixian Wang, Kai Niu, Ping Zhang

Via arXiv

Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition

Nov 10, 2022
Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang

Via arXiv

Modular Domain Adaptation for Conformer-Based Streaming ASR

May 22, 2023
Qiujia Li, Bo Li, Dongseong Hwang, Tara N. Sainath, Pedro M. Mengibar

Via arXiv

Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework

Jul 06, 2023
Eliya Segev, Maya Alroy, Ronen Katsir, Noam Wies, Ayana Shenhav, Yael Ben-Oren, David Zar, Oren Tadmor, Jacob Bitterman, Amnon Shashua, Tal Rosenwein

Via arXiv

Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional Context for Continuous Speech Recognition

Jan 10, 2023
Piyush Behre, Sharman Tan, Padma Varadharajan, Shuangyu Chang

Via arXiv

MF-PAM: Accurate Pitch Estimation through Periodicity Analysis and Multi-level Feature Fusion

Jun 16, 2023
Woo-Jin Chung, Doyeon Kim, Soo-Whan Chung, Hong-Goo Kang

Via arXiv