"speech": models, code, and papers

X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages

May 10, 2023
Feilong Chen, Minglun Han, Haozhi Zhao, Qingyang Zhang, Jing Shi, Shuang Xu, Bo Xu

Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR

Apr 28, 2023
Ruchao Fan, Yunzheng Zhu, Jinhan Wang, Abeer Alwan

Explore, Establish, Exploit: Red Teaming Language Models from Scratch

Jun 21, 2023
Stephen Casper, Jason Lin, Joe Kwon, Gatlen Culp, Dylan Hadfield-Menell

Variational Speech Waveform Compression to Catalyze Semantic Communications

Dec 13, 2022
Shengshi Yao, Zixuan Xiao, Sixian Wang, Jincheng Dai, Kai Niu, Ping Zhang

TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System For ICASSP 2023 DNS Challenge

Mar 14, 2023
Yukai Ju, Jun Chen, Shimin Zhang, Shulin He, Wei Rao, Weixin Zhu, Yannan Wang, Tao Yu, Shidong Shang

QVoice: Arabic Speech Pronunciation Learning Application

May 09, 2023
Yassine El Kheir, Fouad Khnaisser, Shammur Absar Chowdhury, Hamdy Mubarak, Shazia Afzal, Ahmed Ali

Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition

Nov 10, 2022
Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang

Wireless Deep Speech Semantic Transmission

Nov 04, 2022
Zixuan Xiao, Shengshi Yao, Jincheng Dai, Sixian Wang, Kai Niu, Ping Zhang

Extending Audio Masked Autoencoders Toward Audio Restoration

May 11, 2023
Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji

Analysing Discrete Self Supervised Speech Representation for Spoken Language Modeling

Jan 02, 2023
Amitay Sicherman, Yossi Adi
