Alert button
Picture for Yashesh Gaur

Yashesh Gaur

Alert button

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Add code
Bookmark button
Alert button
Apr 11, 2022
Jian Xue, Peidong Wang, Jinyu Li, Matt Post, Yashesh Gaur

Figure 1 for Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Figure 2 for Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Figure 3 for Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Figure 4 for Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Viaarxiv icon

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

Add code
Bookmark button
Alert button
Mar 30, 2022
Naoyuki Kanda, Jian Wu, Yu Wu, Xiong Xiao, Zhong Meng, Xiaofei Wang, Yashesh Gaur, Zhuo Chen, Jinyu Li, Takuya Yoshioka

Figure 1 for Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
Figure 2 for Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
Figure 3 for Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
Figure 4 for Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
Viaarxiv icon

Streaming Multi-Talker ASR with Token-Level Serialized Output Training

Add code
Bookmark button
Alert button
Feb 05, 2022
Naoyuki Kanda, Jian Wu, Yu Wu, Xiong Xiao, Zhong Meng, Xiaofei Wang, Yashesh Gaur, Zhuo Chen, Jinyu Li, Takuya Yoshioka

Figure 1 for Streaming Multi-Talker ASR with Token-Level Serialized Output Training
Figure 2 for Streaming Multi-Talker ASR with Token-Level Serialized Output Training
Figure 3 for Streaming Multi-Talker ASR with Token-Level Serialized Output Training
Figure 4 for Streaming Multi-Talker ASR with Token-Level Serialized Output Training
Viaarxiv icon

Sequence-level self-learning with multiple hypotheses

Add code
Bookmark button
Alert button
Dec 10, 2021
Kenichi Kumatani, Dimitrios Dimitriadis, Yashesh Gaur, Robert Gmyr, Sefik Emre Eskimez, Jinyu Li, Michael Zeng

Figure 1 for Sequence-level self-learning with multiple hypotheses
Figure 2 for Sequence-level self-learning with multiple hypotheses
Figure 3 for Sequence-level self-learning with multiple hypotheses
Figure 4 for Sequence-level self-learning with multiple hypotheses
Viaarxiv icon

Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition

Add code
Bookmark button
Alert button
Oct 14, 2021
Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong

Figure 1 for Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition
Figure 2 for Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition
Viaarxiv icon

Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR

Add code
Bookmark button
Alert button
Oct 07, 2021
Naoyuki Kanda, Xiong Xiao, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

Figure 1 for Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR
Figure 2 for Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR
Figure 3 for Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR
Viaarxiv icon

Continuous Streaming Multi-Talker ASR with Dual-path Transducers

Add code
Bookmark button
Alert button
Sep 17, 2021
Desh Raj, Liang Lu, Zhuo Chen, Yashesh Gaur, Jinyu Li

Figure 1 for Continuous Streaming Multi-Talker ASR with Dual-path Transducers
Figure 2 for Continuous Streaming Multi-Talker ASR with Dual-path Transducers
Figure 3 for Continuous Streaming Multi-Talker ASR with Dual-path Transducers
Figure 4 for Continuous Streaming Multi-Talker ASR with Dual-path Transducers
Viaarxiv icon

A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio

Add code
Bookmark button
Alert button
Jul 06, 2021
Naoyuki Kanda, Xiong Xiao, Jian Wu, Tianyan Zhou, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

Figure 1 for A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Figure 2 for A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Figure 3 for A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Figure 4 for A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Viaarxiv icon

Dynamic Gradient Aggregation for Federated Domain Adaptation

Add code
Bookmark button
Alert button
Jun 14, 2021
Dimitrios Dimitriadis, Kenichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez

Figure 1 for Dynamic Gradient Aggregation for Federated Domain Adaptation
Figure 2 for Dynamic Gradient Aggregation for Federated Domain Adaptation
Figure 3 for Dynamic Gradient Aggregation for Federated Domain Adaptation
Figure 4 for Dynamic Gradient Aggregation for Federated Domain Adaptation
Viaarxiv icon