Alert button
Picture for Shota Orihashi

Shota Orihashi

Alert button

Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations

Add code
Bookmark button
Alert button
Feb 21, 2022
Yoshihiro Yamazaki, Shota Orihashi, Ryo Masumura, Mihiro Uchida, Akihiko Takashima

Figure 1 for Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations
Figure 2 for Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations
Figure 3 for Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations
Figure 4 for Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations
Viaarxiv icon

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages

Add code
Bookmark button
Alert button
Nov 24, 2021
Shota Orihashi, Yoshihiro Yamazaki, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Ryo Masumura

Figure 1 for Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages
Figure 2 for Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages
Figure 3 for Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages
Figure 4 for Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages
Viaarxiv icon

Hierarchical Knowledge Distillation for Dialogue Sequence Labeling

Add code
Bookmark button
Alert button
Nov 22, 2021
Shota Orihashi, Yoshihiro Yamazaki, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Ryo Masumura

Figure 1 for Hierarchical Knowledge Distillation for Dialogue Sequence Labeling
Figure 2 for Hierarchical Knowledge Distillation for Dialogue Sequence Labeling
Figure 3 for Hierarchical Knowledge Distillation for Dialogue Sequence Labeling
Figure 4 for Hierarchical Knowledge Distillation for Dialogue Sequence Labeling
Viaarxiv icon

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning

Add code
Bookmark button
Alert button
Jul 07, 2021
Tomohiro Tanaka, Ryo Masumura, Mana Ihori, Akihiko Takashima, Shota Orihashi, Naoki Makishima

Figure 1 for End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
Figure 2 for End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
Figure 3 for End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
Figure 4 for End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
Viaarxiv icon

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition

Add code
Bookmark button
Alert button
Jul 04, 2021
Tomohiro Tanaka, Ryo Masumura, Mana Ihori, Akihiko Takashima, Takafumi Moriya, Takanori Ashihara, Shota Orihashi, Naoki Makishima

Figure 1 for Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition
Figure 2 for Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition
Figure 3 for Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition
Figure 4 for Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition
Viaarxiv icon

Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation

Add code
Bookmark button
Alert button
Jul 04, 2021
Ryo Masumura, Daiki Okamura, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi

Figure 1 for Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation
Figure 2 for Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation
Viaarxiv icon

Enrollment-less training for personalized voice activity detection

Add code
Bookmark button
Alert button
Jun 23, 2021
Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura

Figure 1 for Enrollment-less training for personalized voice activity detection
Figure 2 for Enrollment-less training for personalized voice activity detection
Figure 3 for Enrollment-less training for personalized voice activity detection
Viaarxiv icon

Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens

Add code
Bookmark button
Alert button
Jun 23, 2021
Mana Ihori, Naoki Makishima, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura

Figure 1 for Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens
Figure 2 for Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens
Figure 3 for Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens
Viaarxiv icon

Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss

Add code
Bookmark button
Alert button
Mar 02, 2021
Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi, Ryo Masumura

Figure 1 for Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Figure 2 for Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Figure 3 for Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Figure 4 for Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Viaarxiv icon

Large-Context Conversational Representation Learning: Self-Supervised Learning for Conversational Documents

Add code
Bookmark button
Alert button
Feb 16, 2021
Ryo Masumura, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi

Figure 1 for Large-Context Conversational Representation Learning: Self-Supervised Learning for Conversational Documents
Figure 2 for Large-Context Conversational Representation Learning: Self-Supervised Learning for Conversational Documents
Figure 3 for Large-Context Conversational Representation Learning: Self-Supervised Learning for Conversational Documents
Figure 4 for Large-Context Conversational Representation Learning: Self-Supervised Learning for Conversational Documents
Viaarxiv icon