"speech": models, code, and papers

Assessing Phrase Break of ESL Speech with Pre-trained Language Models and Large Language Models

Jun 08, 2023
Zhiyi Wang, Shaoguang Mao, Wenshan Wu, Yan Xia, Yan Deng, Jonathan Tien

Systematic Offensive Stereotyping (SOS) Bias in Language Models

Aug 21, 2023
Fatma Elsafoury

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss

May 24, 2023
Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka, Nobukatsu Hojo

Comparative Analysis of the wav2vec 2.0 Feature Extractor

Aug 08, 2023
Peter Vieting, Ralf Schlüter, Hermann Ney

Mapping EEG Signals to Visual Stimuli: A Deep Learning Approach to Match vs. Mismatch Classification

Sep 08, 2023
Yiqian Yang, Zhengqiao Zhao, Qian Wang, Yan Yang, Jingdong Chen

Universal Automatic Phonetic Transcription into the International Phonetic Alphabet

Aug 07, 2023
Chihiro Taguchi, Yusuke Sakai, Parisa Haghani, David Chiang

AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment

May 13, 2023
Ruiqi Li, Rongjie Huang, Lichao Zhang, Jinglin Liu, Zhou Zhao

SlothSpeech: Denial-of-service Attack Against Speech Recognition Models

Jun 01, 2023
Mirazul Haque, Rutvij Shah, Simin Chen, Berrak Şişman, Cong Liu, Wei Yang

Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Jun 01, 2023
Xiaohuai Le, Tong Lei, Li Chen, Yiqing Guo, Chao He, Cheng Chen, Xianjun Xia, Hua Gao, Yijian Xiao, Piao Ding, Shenyi Song, Jing Lu

FonMTL: Towards Multitask Learning for the Fon Language

Sep 11, 2023
Bonaventure F. P. Dossou, Iffanice Houndayi, Pamely Zantou, Gilles Hacheme
