Mohammad Mahdi Derakhshani

Purrception: Variational Flow Matching for Vector-Quantized Image Generation

Oct 01, 2025
Via arXiv

NeoBabel: A Multilingual Open Tower for Visual Generation

Jul 08, 2025
Via arXiv

Continual Hyperbolic Learning of Instances and Classes

Jun 12, 2025
Via arXiv

Learning to Ground VLMs without Forgetting

Oct 14, 2024
Via arXiv

TULIP: Token-length Upgraded CLIP

Oct 13, 2024
Via arXiv

Any-Shift Prompting for Generalization over Distributions

Feb 15, 2024
Via arXiv

Unlocking Spatial Comprehension in Text-to-Image Diffusion Models

Nov 28, 2023
Via arXiv

Small Visual Language Models can also be Open-Ended Few-Shot Learners

Sep 30, 2023
Via arXiv

Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models

Mar 10, 2023
Via arXiv

Variational prompt tuning improves generalization of vision-language models

Oct 05, 2022
Via arXiv