"Image": models, code, and papers

PiTL: Cross-modal Retrieval with Weakly-supervised Vision-language Pre-training via Prompting

Jul 14, 2023
Zixin Guo, Tzu-Jui Julius Wang, Selen Pehlivan, Abduljalil Radman, Jorma Laaksonen

DETR Doesn't Need Multi-Scale or Locality Design

Aug 03, 2023
Yutong Lin, Yuhui Yuan, Zheng Zhang, Chen Li, Nanning Zheng, Han Hu

MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments

Jul 18, 2023
Spyros Gidaris, Andrei Bursuc, Oriane Simeoni, Antonin Vobecky, Nikos Komodakis, Matthieu Cord, Patrick Pérez

Ill-Posed Image Reconstruction Without an Image Prior

Apr 12, 2023
Oscar Leong, Angela F. Gao, He Sun, Katherine L. Bouman

Towards Viewpoint-Invariant Visual Recognition via Adversarial Training

Jul 16, 2023
Shouwei Ruan, Yinpeng Dong, Hang Su, Jianteng Peng, Ning Chen, Xingxing Wei

SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension

Aug 02, 2023
Bohao Li, Rui Wang, Guangzhi Wang, Yuying Ge, Yixiao Ge, Ying Shan

Incorporating Season and Solar Specificity into Renderings made by a NeRF Architecture using Satellite Images

Aug 02, 2023
Michael Gableman, Avinash Kak

Push the Boundary of SAM: A Pseudo-label Correction Framework for Medical Segmentation

Aug 02, 2023
Ziyi Huang, Hongshan Liu, Haofeng Zhang, Fuyong Xing, Andrew Laine, Elsa Angelini, Christine Hendon, Yu Gan

Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation

Aug 02, 2023
Quan Tang, Bowen Zhang, Jiajun Liu, Fagui Liu, Yifan Liu

Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity

May 14, 2023
Raman Dutt, Linus Ericsson, Pedro Sanchez, Sotirios A. Tsaftaris, Timothy Hospedales
