"Image": models, code, and papers

Text and Click inputs for unambiguous open vocabulary instance segmentation

Nov 24, 2023
Nikolai Warner, Meera Hahn, Jonathan Huang, Irfan Essa, Vighnesh Birodkar

Calibrated Uncertainties for Neural Radiance Fields

Dec 04, 2023
Niki Amini-Naieni, Tomas Jakab, Andrea Vedaldi, Ronald Clark

Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images

Dec 04, 2023
Zhuoran Yu, Chenchen Zhu, Sean Culatana, Raghuraman Krishnamoorthi, Fanyi Xiao, Yong Jae Lee

Fully Spiking Denoising Diffusion Implicit Models

Dec 04, 2023
Ryo Watanabe, Yusuke Mukuta, Tatsuya Harada

TextAug: Test time Text Augmentation for Multimodal Person Re-identification

Dec 04, 2023
Mulham Fawakherji, Eduard Vazquez, Pasquale Giampa, Binod Bhattarai

Coronary Atherosclerotic Plaque Characterization with Photon-counting CT: a Simulation-based Feasibility Study

Dec 04, 2023
Mengzhou Li, Mingye Wu, Jed Pack, Pengwei Wu, Bruno De Man, Adam Wang, Koen Nieman, Ge Wang

Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning

Dec 05, 2023
Zhuo Huang, Chang Liu, Yinpeng Dong, Hang Su, Shibao Zheng, Tongliang Liu

Learning to Generate Parameters of ConvNets for Unseen Image Data

Oct 24, 2023
Shiye Wang, Kaituo Feng, Changsheng Li, Ye Yuan, Guoren Wang

GeoChat: Grounded Large Vision-Language Model for Remote Sensing

Nov 24, 2023
Kartik Kuckreja, Muhammad Sohail Danish, Muzammal Naseer, Abhijit Das, Salman Khan, Fahad Shahbaz Khan

FuseNet: Self-Supervised Dual-Path Network for Medical Image Segmentation

Nov 22, 2023
Amirhossein Kazerouni, Sanaz Karimijafarbigloo, Reza Azad, Yury Velichko, Ulas Bagci, Dorit Merhof
