Hiroshi Sato

All-in-One ASR: Unifying Encoder-Decoder Models of CTC, Attention, and Transducer in Dual-Mode ASR

Dec 12, 2025

Accuracy-Preserving CNN Pruning Method under Limited Data Availability

Nov 13, 2025

Generic Speech Enhancement with Self-Supervised Representation Space Loss

Jul 10, 2025

Motion Generation for Food Topping Challenge 2024: Serving Salmon Roe Bowl and Picking Fried Chicken

Apr 30, 2025

Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning

Dec 04, 2024

Error-Feedback Model for Output Correction in Bilateral Control-Based Imitation Learning

Nov 19, 2024

Guided Speaker Embedding

Oct 16, 2024

Investigation of Speaker Representation for Target-Speaker Speech Processing

Oct 15, 2024

Boosting Hybrid Autoregressive Transducer-based ASR with Internal Acoustic Model Training and Dual Blank Thresholding

Sep 30, 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR

Sep 30, 2024