Yuekai Zhang

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

Oct 27, 2023
Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis

LightVessel: Exploring Lightweight Coronary Artery Vessel Segmentation via Similarity Knowledge Distillation

Nov 02, 2022
Hao Dang, Yuekai Zhang, Xingqun Qi, Wanting Zhou, Muyi Sun

TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty

Nov 01, 2022
Xingchen Song, Di Wu, Zhiyong Wu, Binbin Zhang, Yuekai Zhang, Zhendong Peng, Wenpeng Li, Fuping Pan, Changbao Zhu

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

Nov 29, 2021
Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W Black, Shinji Watanabe

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

Apr 06, 2021
Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko

Tiny Transducer: A Highly-efficient Speech Recognition Model on Edge Devices

Feb 07, 2021
Yuekai Zhang, Sining Sun, Long Ma

Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss

Oct 22, 2020
Jiatong Shi, Shuai Guo, Nan Huo, Yuekai Zhang, Qin Jin
