Ashwin Swaminathan

NAVERO: Unlocking Fine-Grained Semantics for Video-Language Compositionality

Aug 18, 2024

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

Jun 12, 2024

THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

May 08, 2024

Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model

Apr 28, 2024

Mixed-Query Transformer: A Unified Image Segmentation Architecture

Apr 06, 2024

On the Scalability of Diffusion-based Text-to-Image Generation

Apr 03, 2024

CPR: Retrieval Augmented Generation for Copyright Protection

Mar 27, 2024

Multi-Modal Hallucination Control by Visual Information Grounding

Mar 20, 2024

Fast Sparse View Guided NeRF Update for Object Reconfigurations

Mar 16, 2024

A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D

Feb 29, 2024