Zhehuai Chen

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data

Jun 28, 2024

BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5

Jun 28, 2024

DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment

Jun 27, 2024

Instruction Data Generation and Unsupervised Adaptation for Speech Language Models

Jun 18, 2024

Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition

Apr 04, 2024

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

Feb 10, 2024

High-precision Voice Search Query Correction via Retrievable Speech-text Embeddings

Jan 08, 2024

SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation

Oct 13, 2023

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Aug 14, 2023

Understanding Shared Speech-Text Representations

Apr 27, 2023