Bowen Shi

XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

Mar 21, 2024
HyoJung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang

Via arXiv

Towards Privacy-Aware Sign Language Translation at Scale

Feb 14, 2024
Phillip Rust, Bowen Shi, Skyler Wang, Necati Cihan Camgöz, Jean Maillard

Via arXiv

UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding

Jan 18, 2024
Bowen Shi, Peisen Zhao, Zichen Wang, Yuhang Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian, Xiaopeng Zhang

Via arXiv

Audiobox: Unified Audio Generation with Natural Language Prompts

Dec 25, 2023
Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu

Via arXiv

AiluRus: A Scalable ViT Framework for Dense Prediction

Nov 02, 2023
Jin Li, Yaoming Wang, Xiaopeng Zhang, Bowen Shi, Dongsheng Jiang, Chenglin Li, Wenrui Dai, Hongkai Xiong, Qi Tian

Via arXiv

Generative Pre-training for Speech with Flow Matching

Oct 25, 2023
Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu

Via arXiv

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Sep 05, 2023
Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan

Via arXiv

Toward American Sign Language Processing in the Real World: Data, Tasks, and Methods

Aug 23, 2023
Bowen Shi

Via arXiv

EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

Aug 10, 2023
Tu Anh Nguyen, Wei-Ning Hsu, Antony D'Avirro, Bowen Shi, Itai Gat, Maryam Fazel-Zarandi, Tal Remez, Jade Copet, Gabriel Synnaeve, Michael Hassid, Felix Kreuk, Yossi Adi, Emmanuel Dupoux

Via arXiv