Zhihao Du

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

Feb 13, 2024
Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen

LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

Oct 11, 2023
Jiaming Wang, Zhihao Du, Qian Chen, Yunfei Chu, Zhifu Gao, Zerui Li, Kai Hu, Xiaohuan Zhou, Jin Xu, Ziyang Ma, Wen Wang, Siqi Zheng, Chang Zhou, Zhijie Yan, Shiliang Zhang

SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR

Oct 07, 2023
Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie

The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR

Sep 24, 2023
Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu

FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec

Sep 14, 2023
Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng

CASA-ASR: Context-Aware Speaker-Attributed ASR

May 21, 2023
Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai

FunASR: A Fundamental End-to-End Speech Recognition Toolkit

May 18, 2023
Zhifu Gao, Zerui Li, Jiaming Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Zhangyu Xiao, Shiliang Zhang

TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization

Mar 08, 2023
Jiaming Wang, Zhihao Du, Shiliang Zhang

Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis

Nov 18, 2022
Zhihao Du, Shiliang Zhang, Siqi Zheng, Zhijie Yan

A Comparative Study on multichannel Speaker-attributed automatic speech recognition in Multi-party Meetings

Nov 01, 2022
Mohan Shi, Jie Zhang, Zhihao Du, Fan Yu, Shiliang Zhang, Li-Rong Dai
