Alberto Baldrati

Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion

Feb 06, 2025

Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation

Jul 03, 2024

iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval

May 05, 2024

Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing

Mar 25, 2024

Mapping Memes to Words for Multimodal Hateful Meme Classification

Oct 12, 2023

Exploiting CLIP-based Multi-modal Approach for Artwork Classification and Retrieval

Sep 21, 2023

OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data

Sep 11, 2023

Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features

Aug 22, 2023

ECO: Ensembling Context Optimization for Vision-Language Models

Jul 26, 2023

LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

May 22, 2023