Leonid Karlinsky

Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers

Jul 06, 2023
Yuan Gong, Sameer Khurana, Leonid Karlinsky, James Glass

Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models

Jun 01, 2023
Sivan Doveh, Assaf Arbelle, Sivan Harary, Roei Herzig, Donghyun Kim, Paola Cascante-Bonilla, Amit Alfassy, Rameswar Panda, Raja Giryes, Rogerio Feris, Shimon Ullman, Leonid Karlinsky

LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

May 29, 2023
M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Mateusz Kozinski, Horst Possegger, Rogerio Feris, Horst Bischof

Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages

May 21, 2023
Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogerio Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James Glass

Listen, Think, and Understand

May 18, 2023
Yuan Gong, Hongyin Luo, Alexander H. Liu, Leonid Karlinsky, James Glass

Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs

May 10, 2023
Roei Herzig, Alon Mendelson, Leonid Karlinsky, Assaf Arbelle, Rogerio Feris, Trevor Darrell, Amir Globerson

Constructive Assimilation: Boosting Contrastive Learning Performance through View Generation Strategies

Apr 08, 2023
Ligong Han, Seungwook Han, Shivchander Sudalairaj, Charlotte Loh, Rumen Dangovski, Fei Deng, Pulkit Agrawal, Dimitris Metaxas, Leonid Karlinsky, Tsui-Wei Weng, Akash Srivastava

Going Beyond Nouns With Vision & Language Models Using Synthetic Data

Mar 30, 2023
Paola Cascante-Bonilla, Khaled Shehada, James Seale Smith, Sivan Doveh, Donghyun Kim, Rameswar Panda, Gül Varol, Aude Oliva, Vicente Ordonez, Rogerio Feris, Leonid Karlinsky

MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

Mar 15, 2023
Wei Lin, Leonid Karlinsky, Nina Shvetsova, Horst Possegger, Mateusz Kozinski, Rameswar Panda, Rogerio Feris, Hilde Kuehne, Horst Bischof
