"speech": models, code, and papers

SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling

Mar 24, 2022
Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura, Naoko Tanji, Hiroshi Saruwatari

SepIt: Approaching a Single Channel Speech Separation Bound

May 25, 2022
Shahar Lutati, Eliya Nachmani, Lior Wolf

Conditional Diffusion Probabilistic Model for Speech Enhancement

Feb 10, 2022
Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao

Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment

Apr 08, 2022
Tobias Weise, Philipp Klumpp, Andreas Maier, Elmar Noeth, Bjoern Heismann, Maria Schuster, Seung Hee Yang

Joint Speech Recognition and Audio Captioning

Feb 03, 2022
Chaitanya Narisetty, Emiru Tsunoo, Xuankai Chang, Yosuke Kashiwagi, Michael Hentschel, Shinji Watanabe

SepIt: Approaching a Single Channel Speech Separation Bound

May 24, 2022
Shahar Lutati, Eliya Nachmani, Lior Wolf

Improving Monaural Speech Enhancement with Multi-head Self and Cross Attention

May 20, 2022
Xinmeng Xu, Jianjun Hao

Approaching an unknown communication system by latent space exploration and causal inference

Mar 20, 2023
Gašper Beguš, Andrej Leban, Shane Gero

Learnable Frontends that do not Learn: Quantifying Sensitivity to Filterbank Initialisation

Feb 20, 2023
Mark Anderson, Tomi Kinnunen, Naomi Harte

Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages

May 02, 2022
Felix Wu, Kwangyoun Kim, Shinji Watanabe, Kyu Han, Ryan McDonald, Kilian Q. Weinberger, Yoav Artzi
