Picture for Shang-Wen Li

Shang-Wen Li

Demystifying CLIP Data

Add code
Oct 02, 2023
Figure 1 for Demystifying CLIP Data
Figure 2 for Demystifying CLIP Data
Figure 3 for Demystifying CLIP Data
Figure 4 for Demystifying CLIP Data
Viaarxiv icon

AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models

Add code
Sep 19, 2023
Figure 1 for AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
Figure 2 for AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
Figure 3 for AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
Viaarxiv icon

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Add code
Sep 05, 2023
Figure 1 for Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
Figure 2 for Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
Figure 3 for Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
Figure 4 for Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
Viaarxiv icon

Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target

Add code
May 29, 2023
Viaarxiv icon

Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering

Add code
May 26, 2023
Viaarxiv icon

Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Mode

Add code
May 19, 2023
Figure 1 for Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Mode
Figure 2 for Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Mode
Figure 3 for Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Mode
Figure 4 for Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Mode
Viaarxiv icon

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

Add code
May 18, 2023
Figure 1 for ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Figure 2 for ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Viaarxiv icon

DINOv2: Learning Robust Visual Features without Supervision

Add code
Apr 14, 2023
Figure 1 for DINOv2: Learning Robust Visual Features without Supervision
Figure 2 for DINOv2: Learning Robust Visual Features without Supervision
Figure 3 for DINOv2: Learning Robust Visual Features without Supervision
Figure 4 for DINOv2: Learning Robust Visual Features without Supervision
Viaarxiv icon

SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks

Add code
Mar 01, 2023
Figure 1 for SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Figure 2 for SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Figure 3 for SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Figure 4 for SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Viaarxiv icon

MAViL: Masked Audio-Video Learners

Add code
Dec 15, 2022
Figure 1 for MAViL: Masked Audio-Video Learners
Figure 2 for MAViL: Masked Audio-Video Learners
Figure 3 for MAViL: Masked Audio-Video Learners
Figure 4 for MAViL: Masked Audio-Video Learners
Viaarxiv icon