Hank Liao

Google Inc

DiarizationLM: Speaker Diarization Post-Processing with Large Language Models

Jan 16, 2024
Quan Wang, Yiling Huang, Guanlong Zhao, Evan Clark, Wei Xia, Hank Liao

On Robustness to Missing Video for Audiovisual Speech Recognition

Dec 19, 2023
Oscar Chang, Otavio Braga, Hank Liao, Dmitriy Serdyuk, Olivier Siohan

Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network

Sep 15, 2023
Yiling Huang, Weiran Wang, Guanlong Zhao, Hank Liao, Wei Xia, Quan Wang

USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models

Sep 14, 2023
Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang

Conformers are All You Need for Visual Speech Recognition

Feb 17, 2023
Oscar Chang, Hank Liao, Dmitriy Serdyuk, Ankit Shah, Olivier Siohan

End-to-End Multi-Person Audio/Visual Automatic Speech Recognition

May 11, 2022
Otavio Braga, Takaki Makino, Olivier Siohan, Hank Liao

Recurrent Neural Network Transducer for Audio-Visual Speech Recognition

Nov 08, 2019
Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan

A comparison of end-to-end models for long-form speech recognition

Nov 06, 2019
Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara Sainath, Yonghui Wu

Adversarial Training for Multilingual Acoustic Modeling

Jun 17, 2019
Ke Hu, Hasim Sak, Hank Liao

Neural Language Modeling with Visual Features

Mar 07, 2019
Antonios Anastasopoulos, Shankar Kumar, Hank Liao
