Bhuvana Ramabhadran

Zero-shot Cross-lingual Voice Transfer for TTS

Sep 20, 2024

STAB: Speech Tokenizer Assessment Benchmark

Sep 04, 2024

Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models

Jul 05, 2024

Speech Prefix-Tuning with RNNT Loss for Improving LLM Predictions

Jun 20, 2024

ASTRA: Aligning Speech and Text Representations for ASR without Sampling

Jun 10, 2024

Text Injection for Neural Contextual Biasing

Jun 05, 2024

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

Feb 29, 2024

O-1: Self-training with Oracle and 1-best Hypothesis

Aug 14, 2023

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Aug 14, 2023

Large-scale Language Model Rescoring on Long-form Data

Jun 13, 2023