Alert button
Picture for Jinchuan Tian

Jinchuan Tian

Alert button

OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer

Add code
Bookmark button
Alert button
Jan 30, 2024
Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe

Viaarxiv icon

UniAudio: An Audio Foundation Model Toward Universal Audio Generation

Add code
Bookmark button
Alert button
Oct 11, 2023
Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng

Figure 1 for UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Figure 2 for UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Figure 3 for UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Figure 4 for UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Viaarxiv icon

Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data

Add code
Bookmark button
Alert button
Oct 02, 2023
Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe

Figure 1 for Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Figure 2 for Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Figure 3 for Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Figure 4 for Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Viaarxiv icon

Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study

Add code
Bookmark button
Alert button
Sep 27, 2023
Xuankai Chang, Brian Yan, Kwanghee Choi, Jeeweon Jung, Yichen Lu, Soumi Maiti, Roshan Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, Hsiu-Hsuan Wang

Figure 1 for Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Figure 2 for Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Figure 3 for Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Figure 4 for Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Viaarxiv icon

AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data

Add code
Bookmark button
Alert button
Sep 25, 2023
Jianwei Yu, Hangting Chen, Yanyao Bian, Xiang Li, Yi Luo, Jinchuan Tian, Mengyang Liu, Jiayi Jiang, Shuai Wang

Figure 1 for AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data
Figure 2 for AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data
Figure 3 for AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data
Figure 4 for AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data
Viaarxiv icon

Bayes Risk Transducer: Transducer with Controllable Alignment Prediction

Add code
Bookmark button
Alert button
Aug 19, 2023
Jinchuan Tian, Jianwei Yu, Hangting Chen, Brian Yan, Chao Weng, Dong Yu, Shinji Watanabe

Figure 1 for Bayes Risk Transducer: Transducer with Controllable Alignment Prediction
Figure 2 for Bayes Risk Transducer: Transducer with Controllable Alignment Prediction
Figure 3 for Bayes Risk Transducer: Transducer with Controllable Alignment Prediction
Figure 4 for Bayes Risk Transducer: Transducer with Controllable Alignment Prediction
Viaarxiv icon

HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec

Add code
Bookmark button
Alert button
May 07, 2023
Dongchao Yang, Songxiang Liu, Rongjie Huang, Jinchuan Tian, Chao Weng, Yuexian Zou

Figure 1 for HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec
Figure 2 for HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec
Viaarxiv icon

Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks

Add code
Bookmark button
Alert button
Oct 14, 2022
Jinchuan Tian, Brian Yan, Jianwei Yu, Chao Weng, Dong Yu, Shinji Watanabe

Figure 1 for Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks
Figure 2 for Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks
Figure 3 for Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks
Figure 4 for Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks
Viaarxiv icon

LAE: Language-Aware Encoder for Monolingual and Multilingual ASR

Add code
Bookmark button
Alert button
Jun 05, 2022
Jinchuan Tian, Jianwei Yu, Chunlei Zhang, Chao Weng, Yuexian Zou, Dong Yu

Figure 1 for LAE: Language-Aware Encoder for Monolingual and Multilingual ASR
Figure 2 for LAE: Language-Aware Encoder for Monolingual and Multilingual ASR
Figure 3 for LAE: Language-Aware Encoder for Monolingual and Multilingual ASR
Figure 4 for LAE: Language-Aware Encoder for Monolingual and Multilingual ASR
Viaarxiv icon