"Text": models, code, and papers

InfoVisDial: An Informative Visual Dialogue Dataset by Bridging Large Multimodal and Language Models

Dec 21, 2023
Bingbing Wen, Zhengyuan Yang, Jianfeng Wang, Zhe Gan, Bill Howe, Lijuan Wang

Via arXiv

Removing NSFW Concepts from Vision-and-Language Models for Text-to-Image Retrieval and Generation

Nov 27, 2023
Samuele Poppi, Tobia Poppi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara

Via arXiv

BeautifulPrompt: Towards Automatic Prompt Engineering for Text-to-Image Synthesis

Nov 12, 2023
Tingfeng Cao, Chengyu Wang, Bingyan Liu, Ziheng Wu, Jinhui Zhu, Jun Huang

Via arXiv

Panel Transitions for Genre Analysis in Visual Narratives

Dec 14, 2023
Yi-Chun Chen, Arnav Jhala

Via arXiv

RDR: the Recap, Deliberate, and Respond Method for Enhanced Language Understanding

Dec 15, 2023
Yuxin Zi, Hariram Veeramani, Kaushik Roy, Amit Sheth

Via arXiv

Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models

Dec 15, 2023
Senmao Li, Taihang Hu, Fahad Shahbaz Khan, Linxuan Li, Shiqi Yang, Yaxing Wang, Ming-Ming Cheng, Jian Yang

Via arXiv

Muted: Multilingual Targeted Offensive Speech Identification and Visualization

Dec 18, 2023
Christoph Tillmann, Aashka Trivedi, Sara Rosenthal, Santosh Borse, Rong Zhang, Avirup Sil, Bishwaranjan Bhattacharjee

Via arXiv

Learning Object State Changes in Videos: An Open-World Perspective

Dec 19, 2023
Zihui Xue, Kumar Ashutosh, Kristen Grauman

Via arXiv

Explore Spurious Correlations at the Concept Level in Language Models for Text Classification

Nov 15, 2023
Yuhang Zhou, Paiheng Xu, Xiaoyu Liu, Bang An, Wei Ai, Furong Huang

Via arXiv

Quality and Quantity: Unveiling a Million High-Quality Images for Text-to-Image Synthesis in Fashion Design

Nov 19, 2023
Jia Yu, Lichao Zhang, Zijie Chen, Fayu Pan, MiaoMiao Wen, Yuming Yan, Fangsheng Weng, Shuai Zhang, Lili Pan, Zhenzhong Lan

Via arXiv