Suwon Shon

Context-aware Fine-tuning of Self-supervised Speech Models

Dec 16, 2022

On the Use of External Data for Spoken Named Entity Recognition

Dec 14, 2021

SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech

Nov 19, 2021

Leveraging Pre-trained Language Model for Speech Sentiment Analysis

Jun 11, 2021

Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification

May 11, 2019

Domain Attentive Fusion for End-to-end Dialect Identification with Unknown Target Domain

Dec 04, 2018

Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion

Nov 27, 2018

Unsupervised Representation Learning of Speech for Dialect Identification

Sep 12, 2018

Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model

Sep 12, 2018

MIT-QCRI Arabic Dialect Identification System for the 2017 Multi-Genre Broadcast Challenge

Aug 28, 2017