"Image": models, code, and papers

Image Similarity using An Ensemble of Context-Sensitive Models

Jan 15, 2024
Zukang Liao, Min Chen

Assessing the Efficacy of Invisible Watermarks in AI-Generated Medical Images

Feb 08, 2024
Xiaodan Xing, Huiyu Zhou, Yingying Fang, Guang Yang

Enhancement of Bengali OCR by Specialized Models and Advanced Techniques for Diverse Document Types

Feb 07, 2024
AKM Shahariar Azad Rabby, Hasmot Ali, Md. Majedul Islam, Sheikh Abujar, Fuad Rahman

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

Feb 06, 2024
Yang Jin, Zhicheng Sun, Kun Xu, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Feb 06, 2024
Quan Sun, Jinsheng Wang, Qiying Yu, Yufeng Cui, Fan Zhang, Xiaosong Zhang, Xinlong Wang

Multi-level Cross-modal Alignment for Image Clustering

Jan 22, 2024
Liping Qiu, Qin Zhang, Xiaojun Chen, Shaotian Cai

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Feb 13, 2024
Fei Deng, Qifei Wang, Wei Wei, Matthias Grundmann, Tingbo Hou

Continuous Piecewise-Affine Based Motion Model for Image Animation

Jan 17, 2024
Hexiang Wang, Fengqi Liu, Qianyu Zhou, Ran Yi, Xin Tan, Lizhuang Ma

Novel definition and quantitative analysis of branch structure with topological data analysis

Feb 12, 2024
Haruhisa Oda, Mayuko Kida, Yoichi Nakata, Hiroki Kurihara

Exploring Perceptual Limitation of Multimodal Large Language Models

Feb 12, 2024
Jiarui Zhang, Jinyi Hu, Mahyar Khayatkhoei, Filip Ilievski, Maosong Sun
