Wei Han

SAS Video-QA: Self-Adaptive Sampling for Efficient Video Question-Answering

Jul 09, 2023
Wei Han, Hui Chen, Min-Yen Kan, Soujanya Poria

AudioPaLM: A Large Language Model That Can Speak and Listen

Jun 22, 2023
Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Frank

Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding

Jun 08, 2023
Mingqiu Wang, Izhak Shafran, Hagen Soltau, Wei Han, Yuan Cao, Dian Yu, Laurent El Shafey

Label Aware Speech Representation Learning For Language Identification

Jun 07, 2023
Shikhar Vashishth, Shikhar Bharadwaj, Sriram Ganapathy, Ankur Bapna, Min Ma, Wei Han, Vera Axelrod, Partha Talukdar

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

May 30, 2023
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna

Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction

May 23, 2023
Yew Ken Chia, Hui Chen, Wei Han, Guizhen Chen, Sharifah Mahani Aljunied, Soujanya Poria, Lidong Bing

Modular CSI Quantization for FDD Massive MIMO Communication

Mar 23, 2023
Jialing Liao, Roope Vehkalahti, Tefjol Pllaha, Wei Han, Olav Tirkkonen

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

Mar 03, 2023
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Mar 03, 2023
Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu
