"Image": models, code, and papers

ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes

Mar 15, 2024
Hashmat Shadab Malik, Muhammad Huzaifa, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan

Via arXiv

Robust NAS under adversarial training: benchmark, theory, and beyond

Mar 19, 2024
Yongtao Wu, Fanghui Liu, Carl-Johann Simon-Gabriel, Grigorios G Chrysos, Volkan Cevher

Via arXiv

Structure Similarity Preservation Learning for Asymmetric Image Retrieval

Mar 01, 2024
Hui Wu, Min Wang, Wengang Zhou, Houqiang Li

Via arXiv

Adversarial Testing for Visual Grounding via Image-Aware Property Reduction

Mar 02, 2024
Zhiyuan Chang, Mingyang Li, Junjie Wang, Cheng Li, Boyu Wu, Fanjiang Xu, Qing Wang

Via arXiv

Anatomical Structure-Guided Medical Vision-Language Pre-training

Mar 14, 2024
Qingqiu Li, Xiaohan Yan, Jilan Xu, Runtian Yuan, Yuejie Zhang, Rui Feng, Quanli Shen, Xiaobo Zhang, Shujun Wang

Via arXiv

Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion

Mar 14, 2024
Dongyang Li, Chen Wei, Shiying Li, Jiachen Zou, Quanying Liu

Via arXiv

Artifact Feature Purification for Cross-domain Detection of AI-generated Images

Mar 17, 2024
Zheling Meng, Bo Peng, Jing Dong, Tieniu Tan

Via arXiv

Ensembling and Test Augmentation for Covid-19 Detection and Covid-19 Domain Adaptation from 3D CT-Scans

Mar 17, 2024
Fares Bougourzi, Feryal Windal Moula, Halim Benhabiles, Fadi Dornaika, Abdelmalik Taleb-Ahmed

Via arXiv

Image-Guided Autonomous Guidewire Navigation in Robot-Assisted Endovascular Interventions using Reinforcement Learning

Mar 09, 2024
Wentao Liu, Tong Tian, Weijin Xu, Bowen Liang, Qingsheng Lu, Xipeng Pan, Wenyi Zhao, Huihua Yang, Ruisheng Su

Via arXiv

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Mar 11, 2024
Liang Chen, Haozhe Zhao, Tianyu Liu, Shuai Bai, Junyang Lin, Chang Zhou, Baobao Chang

Via arXiv