"Text": models, code, and papers

Parrot Captions Teach CLIP to Spot Text

Dec 28, 2023
Yiqi Lin, Conghui He, Alex Jinpeng Wang, Bin Wang, Weijia Li, Mike Zheng Shou

SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval

Jan 24, 2024
Siwei Wu, Yizhi Li, Kang Zhu, Ge Zhang, Yiming Liang, Kaijing Ma, Chenghao Xiao, Haoran Zhang, Bohao Yang, Wenhu Chen, Wenhao Huang, Noura Al Moubayed, Jie Fu, Chenghua Lin

Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks

Feb 01, 2024
Maan Qraitem, Nazia Tasnim, Kate Saenko, Bryan A. Plummer

Towards scalable robotic intervention of children with Autism Spectrum Disorder using LLMs

Feb 01, 2024
Ruchik Mishra, Karla Conn Welch

Diffusion Model Conditioning on Gaussian Mixture Model and Negative Gaussian Mixture Gradient

Feb 01, 2024
Weiguo Lu, Xuan Wu, Deng Ding, Jinqiao Duan, Jirong Zhuang, Gangnan Yuan

COA-GPT: Generative Pre-trained Transformers for Accelerated Course of Action Development in Military Operations

Feb 01, 2024
Vinicius G. Goecks, Nicholas Waytowich

GUESS: GradUally Enriching SyntheSis for Text-Driven Human Motion Generation

Jan 06, 2024
Xuehao Gao, Yang Yang, Zhenyu Xie, Shaoyi Du, Zhongqian Sun, Yang Wu

Semantic Forwarding for Next Generation Relay Networks

Jan 30, 2024
Enes Arda, Emrecan Kutay, Aylin Yener

Gazetteer-Enhanced Bangla Named Entity Recognition with BanglaBERT Semantic Embeddings K-Means-Infused CRF Model

Jan 30, 2024
Niloy Farhan, Saman Sarker Joy, Tafseer Binte Mannan, Farig Sadeque

Incoherent Probability Judgments in Large Language Models

Jan 30, 2024
Jian-Qiao Zhu, Thomas L. Griffiths
