Ryo Masumura

Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff

Aug 31, 2023
Satoshi Suzuki, Shin'ya Yamaguchi, Shoichiro Takeda, Sekitoshi Kanai, Naoki Makishima, Atsushi Ando, Ryo Masumura

End-to-End Joint Target and Non-Target Speakers ASR

Jun 04, 2023
Ryo Masumura, Naoki Makishima, Taiga Yamane, Yoshihiko Yamazaki, Saki Mizuno, Mana Ihori, Mihiro Uchida, Keita Suzuki, Hiroshi Sato, Tomohiro Tanaka, Akihiko Takashima, Satoshi Suzuki, Takafumi Moriya, Nobukatsu Hojo, Atsushi Ando

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data

May 25, 2023
Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa, Taichi Asami

Improving Scheduled Sampling for Neural Transducer-based ASR

May 25, 2023
Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss

May 24, 2023
Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka, Nobukatsu Hojo

Leveraging Large Text Corpora for End-to-End Speech Summarization

Mar 02, 2023
Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Atsunori Ogawa, Marc Delcroix, Ryo Masumura

On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis

Oct 28, 2022
Atsushi Ando, Ryo Masumura, Akihiko Takashima, Satoshi Suzuki, Naoki Makishima, Keita Suzuki, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato

Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data

Jul 11, 2022
Naoki Makishima, Satoshi Suzuki, Atsushi Ando, Ryo Masumura

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations

Jun 16, 2022
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Ryo Masumura

Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations

Feb 21, 2022
Yoshihiro Yamazaki, Shota Orihashi, Ryo Masumura, Mihiro Uchida, Akihiko Takashima
