"Image": models, code, and papers

You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

Mar 20, 2024
Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

Via arXiv

TraveLER: A Multi-LMM Agent Framework for Video Question-Answering

Apr 01, 2024
Chuyi Shang, Amos You, Sanjay Subramanian, Trevor Darrell, Roei Herzig

Via arXiv

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want

Apr 01, 2024
Weifeng Lin, Xinyu Wei, Ruichuan An, Peng Gao, Bocheng Zou, Yulin Luo, Siyuan Huang, Shanghang Zhang, Hongsheng Li

Via arXiv

FairRAG: Fair Human Generation via Fair Retrieval Augmentation

Mar 29, 2024
Robik Shrestha, Yang Zou, Qiuyu Chen, Zhiheng Li, Yusheng Xie, Siqi Deng

Via arXiv

Relational Representation Learning Network for Cross-Spectral Image Patch Matching

Mar 18, 2024
Chuang Yu, Yunpeng Liu, Jinmiao Zhao, Dou Quan, Zelin Shi

Via arXiv

ImageNot: A contrast with ImageNet preserves model rankings

Apr 02, 2024
Olawale Salaudeen, Moritz Hardt

Via arXiv

Quantifying Noise of Dynamic Vision Sensor

Apr 02, 2024
Evgeny V. Votyakov, Alessandro Artusi

Via arXiv

Constrained Robotic Navigation on Preferred Terrains Using LLMs and Speech Instruction: Exploiting the Power of Adverbs

Apr 02, 2024
Faraz Lotfi, Farnoosh Faraji, Nikhil Kakodkar, Travis Manderson, David Meger, Gregory Dudek

Via arXiv

MotionChain: Conversational Motion Controllers via Multimodal Prompts

Apr 03, 2024
Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan

Via arXiv

SalFoM: Dynamic Saliency Prediction with Video Foundation Models

Apr 03, 2024
Morteza Moradi, Mohammad Moradi, Francesco Rundo, Concetto Spampinato, Ali Borji, Simone Palazzo

Via arXiv