"Text": models, code, and papers

Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing

Mar 04, 2024
Ling Yang, Zhilong Zhang, Zhaochen Yu, Jingwei Liu, Minkai Xu, Stefano Ermon, Bin Cui

Via arXiv

NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging

Mar 06, 2024
Takahiro Shirakawa, Seiichi Uchida

Via arXiv

Language Guided Exploration for RL Agents in Text Environments

Mar 05, 2024
Hitesh Golchha, Sahil Yerawar, Dhruvesh Patel, Soham Dan, Keerthiram Murugesan

Via arXiv

UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control

Mar 06, 2024
Xuweiyi Chen, Tian Xia, Sihan Xu

Via arXiv

Lost in Overlap: Exploring Watermark Collision in LLMs

Mar 15, 2024
Yiyang Luo, Ke Lin, Chao Gu

Via arXiv

Multi-Grained Cross-modal Alignment for Learning Open-vocabulary Semantic Segmentation from Text Supervision

Mar 06, 2024
Yajie Liu, Pu Ge, Qingjie Liu, Di Huang

Via arXiv

$\text{R}^2$-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations

Mar 07, 2024
Xiang Li, Kai Qiu, Jinglu Wang, Xiaohao Xu, Rita Singh, Kashu Yamazaki, Hao Chen, Xiaonan Huang, Bhiksha Raj

Via arXiv

Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation

Mar 08, 2024
Junyan Wang, Zhenhong Sun, Zhiyu Tan, Xuanbai Chen, Weihua Chen, Hao Li, Cheng Zhang, Yang Song

Via arXiv

A Survey of AI-generated Text Forensic Systems: Detection, Attribution, and Characterization

Mar 02, 2024
Tharindu Kumarage, Garima Agrawal, Paras Sheth, Raha Moraffah, Aman Chadha, Joshua Garland, Huan Liu

Via arXiv

Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing

Mar 06, 2024
Bingyan Liu, Chengyu Wang, Tingfeng Cao, Kui Jia, Jun Huang

Via arXiv