Tomoki Hayashi

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit

Apr 11, 2023

Unsupervised Data Selection for TTS: Using Arabic Broadcast News as a Case Study

Jan 26, 2023

ESPnet-ONNX: Bridging a Gap Between Research and Production

Sep 20, 2022

A Comparative Study of Self-supervised Speech Representation Based Voice Conversion

Jul 10, 2022

Improvement of Serial Approach to Anomalous Sound Detection by Incorporating Two Binary Cross-Entropies for Outlier Exposure

Jun 13, 2022

Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis

May 09, 2022

Acoustic Event Detection with Classifier Chains

Feb 17, 2022

Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem

Jan 09, 2022

ViCE: Self-Supervised Visual Concept Embeddings as Contextual and Pixel Appearance Invariant Semantic Representations

Nov 24, 2021

ESPnet2-TTS: Extending the Edge of TTS Research

Oct 15, 2021