"speech recognition": models, code, and papers

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

Oct 10, 2023
Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

Via arXiv

On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition

Oct 12, 2023
Nick Rossenbach, Benedikt Hilmes, Ralf Schlüter

Via arXiv

Acoustic characterization of speech rhythm: going beyond metrics with recurrent neural networks

Jan 22, 2024
François Deloche, Laurent Bonnasse-Gahot, Judit Gervain

Via arXiv

SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge

Sep 04, 2023
Jiaxu Zhu, Changhe Song, Zhiyong Wu, Helen Meng

Via arXiv

SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition

Sep 29, 2023
Hongfei Xue, Qijie Shao, Kaixun Huang, Peikun Chen, Lei Xie, Jie Liu

Via arXiv

Batched Low-Rank Adaptation of Foundation Models

Dec 09, 2023
Yeming Wen, Swarat Chaudhuri

Via arXiv

BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition

Oct 04, 2023
Peikun Chen, Fan Yu, Yuhao Lian, Hongfei Xue, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

Via arXiv

Optimizing Two-Pass Cross-Lingual Transfer Learning: Phoneme Recognition and Phoneme to Grapheme Translation

Dec 06, 2023
Wonjun Lee, Gary Geunbae Lee, Yunsu Kim

Via arXiv

Graph Convolutions Enrich the Self-Attention in Transformers!

Dec 07, 2023
Jeongwhan Choi, Hyowon Wi, Jayoung Kim, Yehjin Shin, Kookjin Lee, Nathaniel Trask, Noseong Park

Via arXiv

Large Language Models for Autonomous Driving: Real-World Experiments

Dec 14, 2023
Can Cui, Zichong Yang, Yupeng Zhou, Yunsheng Ma, Juanwu Lu, Ziran Wang

Via arXiv