
Tomohiro Tanaka

Attention as Annotation: Generating Images and Pseudo-masks for Weakly Supervised Semantic Segmentation with Diffusion

Sep 04, 2023

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?

Jun 14, 2023

Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization

Jun 07, 2023

End-to-End Joint Target and Non-Target Speakers ASR

Jun 04, 2023

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data

May 25, 2023

Improving Scheduled Sampling for Neural Transducer-based ASR

May 25, 2023

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss

May 24, 2023

Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models

May 09, 2023

Leveraging Large Text Corpora for End-to-End Speech Summarization

Mar 02, 2023

Ladder Siamese Network: A Method and Insights for Multi-level Self-Supervised Learning

Nov 25, 2022