Stefano Soatto

Mixed-Query Transformer: A Unified Image Segmentation Architecture

Apr 06, 2024
Pei Wang, Zhaowei Cai, Hao Yang, Ashwin Swaminathan, R. Manmatha, Stefano Soatto

Via arXiv

WorDepth: Variational Language Prior for Monocular Depth Estimation

Apr 05, 2024
Ziyao Zeng, Daniel Wang, Fengyu Yang, Hyoungseob Park, Yangchao Wu, Stefano Soatto, Byung-Woo Hong, Dong Lao, Alex Wong

Via arXiv

On the Scalability of Diffusion-based Text-to-Image Generation

Apr 03, 2024
Hao Li, Yang Zou, Ying Wang, Orchid Majumder, Yusheng Xie, R. Manmatha, Ashwin Swaminathan, Zhuowen Tu, Stefano Ermon, Stefano Soatto

Via arXiv

Heat Death of Generative Models in Closed-Loop Learning

Apr 02, 2024
Matteo Marchi, Stefano Soatto, Pratik Chaudhari, Paulo Tabuada

Via arXiv

CPR: Retrieval Augmented Generation for Copyright Protection

Mar 27, 2024
Aditya Golatkar, Alessandro Achille, Luca Zancato, Yu-Xiang Wang, Ashwin Swaminathan, Stefano Soatto

Via arXiv

Multi-Modal Hallucination Control by Visual Information Grounding

Mar 20, 2024
Alessandro Favero, Luca Zancato, Matthew Trager, Siddharth Choudhary, Pramuditha Perera, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto

Via arXiv

Fast Sparse View Guided NeRF Update for Object Reconfigurations

Mar 16, 2024
Ziqi Lu, Jianbo Ye, Xiaohan Fei, Xiaolong Li, Jiawei Mo, Ashwin Swaminathan, Stefano Soatto

Via arXiv

Enhancing Vision-Language Pre-training with Rich Supervisions

Mar 05, 2024
Yuan Gao, Kunyu Shi, Pengkai Zhu, Edouard Belval, Oren Nuriel, Srikar Appalaraju, Shabnam Ghadar, Vijay Mahadevan, Zhuowen Tu, Stefano Soatto

Via arXiv

Non-autoregressive Sequence-to-Sequence Vision-Language Models

Mar 04, 2024
Kunyu Shi, Qi Dong, Luis Goncalves, Zhuowen Tu, Stefano Soatto

Via arXiv

A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D

Feb 29, 2024
Xiaohan Fei, Chethan Parameshwara, Jiawei Mo, Xiaolong Li, Ashwin Swaminathan, CJ Taylor, Paolo Favaro, Stefano Soatto

Via arXiv