Chao-Han Huck Yang


GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

Feb 10, 2024
Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng

Via arXiv

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

Feb 08, 2024
Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, Eng Siong Chng, Chao-Han Huck Yang

Via arXiv

Investigating Training Strategies and Model Robustness of Low-Rank Adaptation for Language Modeling in Speech Recognition

Jan 19, 2024
Yu Yu, Chao-Han Huck Yang, Tuan Dinh, Sungho Ryu, Jari Kolehmainen, Roger Ren, Denis Filimonov, Prashanth G. Shivakumar, Ankur Gandhe, Ariya Rastrow, Jia Xu, Ivan Bulyko, Andreas Stolcke

Via arXiv

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

Jan 19, 2024
Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, Eng Siong Chng

Via arXiv

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue

Jan 17, 2024
Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-yi Lee, Ivan Bulyko

Via arXiv

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Dec 22, 2023
Anirudh S. Sundar, Chao-Han Huck Yang, David M. Chan, Shalini Ghosh, Venkatesh Ravichandran, Phani Sankar Nidadavolu

Via arXiv

Conditional Modeling Based Automatic Video Summarization

Nov 20, 2023
Jia-Hong Huang, Chao-Han Huck Yang, Pin-Yu Chen, Min-Hung Chen, Marcel Worring

Via arXiv

Generative error correction for code-switching speech recognition using large language models

Oct 17, 2023
Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Hexin Liu, Sabato Marco Siniscalchi, Eng Siong Chng

Via arXiv

Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition

Oct 16, 2023
Srijith Radhakrishnan, Chao-Han Huck Yang, Sumeer Ahmad Khan, Rohit Kumar, Narsis A. Kiani, David Gomez-Cabrero, Jesper N. Tegner

Via arXiv