Kyu J. Han

PEAVS: Perceptual Evaluation of Audio-Visual Synchrony Grounded in Viewers' Opinion Scores

Apr 10, 2024

E-Branchformer: Branchformer with Enhanced merging for speech recognition

Sep 30, 2022

On the Use of External Data for Spoken Named Entity Recognition

Dec 14, 2021

SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech

Nov 19, 2021

Multi-mode Transformer Transducer with Stochastic Future Context

Jun 17, 2021

Leveraging Pre-trained Language Model for Speech Sentiment Analysis

Jun 11, 2021

A Review of Speaker Diarization: Recent Advances with Deep Learning

Jan 24, 2021

Multistream CNN for Robust Acoustic Modeling

May 21, 2020

ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition

May 21, 2020

Speaker Diarization with Lexical Information

Apr 13, 2020