"Image": models, code, and papers

Spatial-Temporal Decoupling Contrastive Learning for Skeleton-based Human Action Recognition

Dec 23, 2023
Shaojie Zhang, Jianqin Yin, Yonghao Dang

The Challenges of Image Generation Models in Generating Multi-Component Images

Nov 22, 2023
Tham Yik Foong, Shashank Kotyan, Po Yuan Mao, Danilo Vasconcellos Vargas

Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning

Nov 27, 2023
Huanjin Yao, Wenhao Wu, Zhiheng Li

Multispectral palmprint recognition based on three descriptors: LBP, Shift LBP, and Multi Shift LBP with LDA classifier

Dec 18, 2023
Salwua Aqreerah, Alhaam Alariyibi, Wafa El-Tarhouni

Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model

Dec 18, 2023
Decheng Liu, Xijun Wang, Chunlei Peng, Nannan Wang, Ruiming Hu, Xinbo Gao

Free-Editor: Zero-shot Text-driven 3D Scene Editing

Dec 21, 2023
Nazmul Karim, Umar Khalid, Hasan Iqbal, Jing Hua, Chen Chen

Testing the Segment Anything Model on radiology data

Dec 20, 2023
José Guilherme de Almeida, Nuno M. Rodrigues, Sara Silva, Nickolas Papanikolaou

A Somewhat Robust Image Watermark against Diffusion-based Editing Models

Nov 22, 2023
Mingtian Tan, Tianhao Wang, Somesh Jha

MultiModal-Learning for Predicting Molecular Properties: A Framework Based on Image and Graph Structures

Nov 28, 2023
Zhuoyuan Wang, Jiacong Mi, Shan Lu, Jieyue He

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Dec 14, 2023
Jack Urbanek, Florian Bordes, Pietro Astolfi, Mary Williamson, Vasu Sharma, Adriana Romero-Soriano
