
Parisa Haghani

ASTRA: Aligning Speech and Text Representations for Asr without Sampling

Jun 10, 2024
Via arXiv

Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition

Oct 17, 2023
Via arXiv

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Aug 14, 2023
Via arXiv

Universal Automatic Phonetic Transcription into the International Phonetic Alphabet

Aug 07, 2023
Via arXiv

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Mar 03, 2023
Via arXiv

Accelerating RNN-T Training and Inference Using CTC guidance

Oct 29, 2022
Via arXiv

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

Sep 13, 2022
Via arXiv

A Language Agnostic Multilingual Streaming On-Device ASR System

Aug 29, 2022
Via arXiv

Unsupervised Data Selection via Discrete Speech Representation for ASR

Apr 05, 2022
Via arXiv

Scaling End-to-End Models for Large-Scale Multilingual ASR

Apr 30, 2021
Via arXiv