Chanwoo Kim

AdaMER-CTC: Connectionist Temporal Classification with Adaptive Maximum Entropy Regularization for Automatic Speech Recognition

Mar 18, 2024

Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution

Jan 29, 2024

Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech

Jan 19, 2024

On the compression of shallow non-causal ASR models using knowledge distillation and tied-and-reduced decoder for low-latency on-device speech recognition

Dec 15, 2023

Class-Wise Buffer Management for Incremental Object Detection: An Effective Buffer Training Strategy

Dec 14, 2023

Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis

Oct 05, 2023

Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction

Aug 16, 2023

Macro-block dropout for improved regularization in training end-to-end speech recognition models

Dec 29, 2022

An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space

Nov 06, 2022

Transformer-based Global 3D Hand Pose Estimation in Two Hands Manipulating Objects Scenarios

Oct 20, 2022