Yosuke Higuchi

SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation

Dec 24, 2025
Via arXiv

SpidR: Learning Fast and Stable Linguistic Units for Spoken Language Models Without Supervision

Dec 23, 2025
Via arXiv

End-to-End Speech Recognition with Pre-trained Masked Language Model

Oct 01, 2024
Via arXiv

Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems

Sep 30, 2024
Via arXiv

Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference

Oct 01, 2023
Via arXiv

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition

Sep 19, 2023
Via arXiv

Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition

Sep 09, 2023
Via arXiv

A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding

Nov 10, 2022
Via arXiv

BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder

Nov 02, 2022
Via arXiv

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

Nov 02, 2022
Via arXiv