"Text": models, code, and papers

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

Dec 21, 2023
Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

Via arXiv

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

Jan 03, 2024
Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Semin Kim, Joun Yeop Lee, Nam Soo Kim

Via arXiv

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Jan 17, 2024
Jonghyun Lee, Hansam Cho, Youngjoon Yoo, Seoung Bum Kim, Yonghyun Jeong

Via arXiv

Diffusion-based Blind Text Image Super-Resolution

Dec 13, 2023
Yuzhe Zhang, Jiawei Zhang, Hao Li, Zhouxia Wang, Luwei Hou, Dongqing Zou, Liheng Bian

Via arXiv

FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection

Dec 14, 2023
Hongsuk Choi, Isaac Kasahara, Selim Engin, Moritz Graule, Nikhil Chavan-Dafle, Volkan Isler

Via arXiv

Should ChatGPT Write Your Breakup Text? Exploring the Role of AI in Relationship Dissolution

Jan 18, 2024
Yue Fu, Yixin Chen, Zelia Gomes Da Costa Lai, Alexis Hiniker

Via arXiv

Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

Dec 28, 2023
Haoning Wu, Zicheng Zhang, Weixia Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Yixuan Gao, Annan Wang, Erli Zhang, Wenxiu Sun, Qiong Yan, Xiongkuo Min, Guangtao Zhai, Weisi Lin

Via arXiv

Question-Answer Cross Language Image Matching for Weakly Supervised Semantic Segmentation

Jan 18, 2024
Songhe Deng, Wei Zhuo, Jinheng Xie, Linlin Shen

Via arXiv

Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning

Jan 01, 2024
Kaibin Tian, Yanhua Cheng, Yi Liu, Xinglin Hou, Quan Chen, Han Li

Via arXiv

Excuse me, sir? Your language model is leaking (information)

Jan 18, 2024
Or Zamir

Via arXiv