Parisa Haghani

ASTRA: Aligning Speech and Text Representations for ASR without Sampling

Jun 10, 2024

Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition

Oct 17, 2023

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Aug 14, 2023

Universal Automatic Phonetic Transcription into the International Phonetic Alphabet

Aug 07, 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Mar 03, 2023

Accelerating RNN-T Training and Inference Using CTC guidance

Oct 29, 2022

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

Sep 13, 2022

A Language Agnostic Multilingual Streaming On-Device ASR System

Aug 29, 2022

Unsupervised Data Selection via Discrete Speech Representation for ASR

Apr 05, 2022

Scaling End-to-End Models for Large-Scale Multilingual ASR

Apr 30, 2021