Mana Ihori

End-to-End Joint Target and Non-Target Speakers ASR

Jun 04, 2023
Ryo Masumura, Naoki Makishima, Taiga Yamane, Yoshihiko Yamazaki, Saki Mizuno, Mana Ihori, Mihiro Uchida, Keita Suzuki, Hiroshi Sato, Tomohiro Tanaka, Akihiko Takashima, Satoshi Suzuki, Takafumi Moriya, Nobukatsu Hojo, Atsushi Ando

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss

May 24, 2023
Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka, Nobukatsu Hojo

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations

Jun 16, 2022
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Ryo Masumura

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages

Nov 24, 2021
Shota Orihashi, Yoshihiro Yamazaki, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Ryo Masumura

Hierarchical Knowledge Distillation for Dialogue Sequence Labeling

Nov 22, 2021
Shota Orihashi, Yoshihiro Yamazaki, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Ryo Masumura

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning

Jul 07, 2021
Tomohiro Tanaka, Ryo Masumura, Mana Ihori, Akihiko Takashima, Shota Orihashi, Naoki Makishima

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition

Jul 04, 2021
Tomohiro Tanaka, Ryo Masumura, Mana Ihori, Akihiko Takashima, Takafumi Moriya, Takanori Ashihara, Shota Orihashi, Naoki Makishima

Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation

Jul 04, 2021
Ryo Masumura, Daiki Okamura, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi

Enrollment-less training for personalized voice activity detection

Jun 23, 2021
Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura

Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens

Jun 23, 2021
Mana Ihori, Naoki Makishima, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura
