Shang-Wen Li

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Sep 05, 2023
Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan

Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target

May 29, 2023
Guan-Wei Wu, Guan-Ting Lin, Shang-Wen Li, Hung-yi Lee

Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering

May 26, 2023
Yung-Sung Chuang, Wei Fang, Shang-Wen Li, Wen-tau Yih, James Glass

Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model

May 19, 2023
Puyuan Peng, Shang-Wen Li, Okko Räsänen, Abdelrahman Mohamed, David Harwath

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

May 18, 2023
Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe

DINOv2: Learning Robust Visual Features without Supervision

Apr 14, 2023
Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski

SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks

Mar 01, 2023
Kai-Wei Chang, Yu-Kai Wang, Hua Shen, Iu-thing Kang, Wei-Cheng Tseng, Shang-Wen Li, Hung-yi Lee

MAViL: Masked Audio-Video Learners

Dec 15, 2022
Po-Yao Huang, Vasu Sharma, Hu Xu, Chaitanya Ryali, Haoqi Fan, Yanghao Li, Shang-Wen Li, Gargi Ghosh, Jitendra Malik, Christoph Feichtenhofer

Introducing Semantics into Speech Encoders

Nov 15, 2022
Derek Xu, Shuyan Dong, Changhan Wang, Suyoun Kim, Zhaojiang Lin, Akshat Shrivastava, Shang-Wen Li, Liang-Hsuan Tseng, Alexei Baevski, Guan-Ting Lin, Hung-yi Lee, Yizhou Sun, Wei Wang

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Oct 16, 2022
Tzu-hsun Feng, Annie Dong, Ching-Feng Yeh, Shu-wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee
