Leonid Karlinsky

Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers

Jul 06, 2023

Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models

Jun 01, 2023

LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

May 29, 2023

Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages

May 21, 2023

Listen, Think, and Understand

May 18, 2023

Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs

May 10, 2023

Constructive Assimilation: Boosting Contrastive Learning Performance through View Generation Strategies

Apr 08, 2023

Going Beyond Nouns With Vision & Language Models Using Synthetic Data

Mar 30, 2023

MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

Mar 15, 2023

Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning

Mar 06, 2023