Hung-yi Lee

Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization

Jan 23, 2024
Via arXiv

Over-Reasoning and Redundant Calculation of Large Language Models

Jan 21, 2024
Via arXiv

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue

Jan 17, 2024
Via arXiv

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

Jan 05, 2024
Via arXiv

PEFT for Speech: Unveiling Optimal Placement, Merging Strategies, and Ensemble Techniques

Jan 04, 2024
Via arXiv

Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision

Dec 30, 2023
Via arXiv

GSQA: An End-to-End Model for Generative Spoken Question Answering

Dec 25, 2023
Via arXiv

Noise robust distillation of self-supervised speech models via correlation metrics

Dec 19, 2023
Via arXiv

Scalable Ensemble-based Detection Method against Adversarial Attacks for speaker verification

Dec 14, 2023
Via arXiv

Step by Step to Fairness: Attributing Societal Bias in Task-oriented Dialogue Systems

Nov 14, 2023
Via arXiv