"Text": models, code, and papers

SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-modal Intent Detection

Dec 31, 2023
Shijue Huang, Libo Qin, Bingbing Wang, Geng Tu, Ruifeng Xu

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

Dec 12, 2023
Sicheng Mo, Fangzhou Mu, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, Bolei Zhou

Fine-Grained Image-Text Alignment in Medical Imaging Enables Cyclic Image-Report Generation

Dec 27, 2023
Wenting Chen, Linlin Shen, Xiang Li, Yixuan Yuan

IQAGPT: Image Quality Assessment with Vision-language and ChatGPT Models

Dec 25, 2023
Zhihao Chen, Bin Hu, Chuang Niu, Tao Chen, Yuxin Li, Hongming Shan, Ge Wang

Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-hoc Retrieval

Jan 01, 2024
Weihang Su, Qingyao Ai, Xiangsheng Li, Jia Chen, Yiqun Liu, Xiaolong Wu, Shengluan Hou

OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers

Dec 18, 2023
Han Liang, Jiacheng Bao, Ruichi Zhang, Sihan Ren, Yuecheng Xu, Sibei Yang, Xin Chen, Jingyi Yu, Lan Xu

Misalign, Contrast then Distill: Rethinking Misalignments in Language-Image Pretraining

Dec 19, 2023
Bumsoo Kim, Yeonsik Jo, Jinhyung Kim, Seung Hwan Kim

DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling

Nov 28, 2023
Linqi Zhou, Andy Shih, Chenlin Meng, Stefano Ermon

HiPA: Enabling One-Step Text-to-Image Diffusion Models via High-Frequency-Promoting Adaptation

Nov 30, 2023
Yifan Zhang, Bryan Hooi

Expediting Contrastive Language-Image Pretraining via Self-distilled Encoders

Dec 19, 2023
Bumsoo Kim, Jinhyung Kim, Yeonsik Jo, Seung Hwan Kim
