Yasser Abdelaziz Dahou Djilali

From Unimodal to Multimodal: Scaling up Projectors to Align Modalities

Sep 28, 2024
Via arXiv

Falcon2-11B Technical Report

Jul 20, 2024
Via arXiv

ViSpeR: Multilingual Audio-Visual Speech Recognition

May 27, 2024
Via arXiv

Do Vision and Language Encoders Represent the World Similarly?

Jan 10, 2024
Via arXiv

Learning Saliency From Fixations

Nov 23, 2023
Via arXiv

Do VSR Models Generalize Beyond LRS3?

Nov 23, 2023
Via arXiv

Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping

Aug 11, 2023
Via arXiv

One-Step Distributional Reinforcement Learning

Apr 27, 2023
Via arXiv