Vishal M. Patel

Senior Member, IEEE

The Power of Context: How Multimodality Improves Image Super-Resolution

Mar 18, 2025

Lux Post Facto: Learning Portrait Performance Relighting with Conditional Video Diffusion and a Hybrid Dataset

Mar 18, 2025

Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning

Mar 10, 2025

CSMAE: Cataract Surgical Masked Autoencoder (MAE) based Pre-training

Feb 12, 2025

Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models

Feb 11, 2025

FaceXBench: Evaluating Multimodal LLMs on Face Understanding

Jan 17, 2025

Distilling Multi-modal Large Language Models for Autonomous Driving

Jan 16, 2025

Zero-Shot Scene Understanding for Automatic Target Recognition Using Large Vision-Language Models

Jan 13, 2025

SegFace: Face Segmentation of Long-Tail Classes

Dec 11, 2024

PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition

Dec 10, 2024