"speech": models, code, and papers

Speech SimCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning

Oct 27, 2020
Dongwei Jiang, Wubo Li, Miao Cao, Ruixiong Zhang, Wei Zou, Kun Han, Xiangang Li

SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation

Jul 27, 2022
Artem Ploujnikov, Mirco Ravanelli

Expressive Text-to-Speech using Style Tag

Apr 01, 2021
Minchan Kim, Sung Jun Cheon, Byoung Jin Choi, Jong Jin Kim, Nam Soo Kim

Magnitude or Phase? A Two Stage Algorithm for Dereverberation

Oct 31, 2022
Ayal Schwartz, Sharon Gannot, Shlomo E. Chazan

Fast and parallel decoding for transducer

Oct 31, 2022
Wei Kang, Liyong Guo, Fangjun Kuang, Long Lin, Mingshuang Luo, Zengwei Yao, Xiaoyu Yang, Piotr Żelasko, Daniel Povey

From Nano to Macro: Overview of the IEEE Bio Image and Signal Processing Technical Committee

Oct 31, 2022
Selin Aviyente, Alejandro Frangi, Erik Meijering, Arrate Muñoz-Barrutia, Michael Liebling, Dimitri Van De Ville, Jean-Christophe Olivo-Marin, Jelena Kovačević, Michael Unser

Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS

Jun 21, 2022
Kenta Udagawa, Yuki Saito, Hiroshi Saruwatari

Facetron: Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations

Jul 26, 2021
Se-Yun Um, Jihyun Kim, Jihyun Lee, Sangshin Oh, Kyungguen Byun, Hong-Goo Kang

Complex-valued Spatial Autoencoders for Multichannel Speech Enhancement

Aug 06, 2021
Mhd Modar Halimeh, Walter Kellermann

Hierarchical Diffusion Models for Singing Voice Neural Vocoder

Oct 18, 2022
Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji
