Xiaoshi Wu

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Apr 04, 2024
Dongzhi Jiang, Guanglu Song, Xiaoshi Wu, Renrui Zhang, Dazhong Shen, Zhuofan Zong, Yu Liu, Hongsheng Li

ECNet: Effective Controllable Text-to-Image Diffusion Models

Mar 27, 2024
Sicheng Li, Keqiang Sun, Zhixin Lai, Xiaoshi Wu, Feng Qiu, Haoran Xie, Kazunori Miyata, Hongsheng Li

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

Mar 20, 2024
Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li

JourneyDB: A Benchmark for Generative Image Understanding

Jul 03, 2023
Junting Pan, Keqiang Sun, Yuying Ge, Hao Li, Haodong Duan, Xiaoshi Wu, Renrui Zhang, Aojun Zhou, Zipeng Qin, Yi Wang, Jifeng Dai, Yu Qiao, Hongsheng Li

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Jun 15, 2023
Xiaoshi Wu, Yiming Hao, Keqiang Sun, Yixiong Chen, Feng Zhu, Rui Zhao, Hongsheng Li

Better Aligning Text-to-Image Models with Human Preference

Mar 25, 2023
Xiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, Hongsheng Li

CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching

Mar 23, 2023
Xiaoshi Wu, Feng Zhu, Rui Zhao, Hongsheng Li

Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks

Dec 02, 2021
Xizhou Zhu, Jinguo Zhu, Hao Li, Xiaoshi Wu, Xiaogang Wang, Hongsheng Li, Xiaohua Wang, Jifeng Dai

Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision

Aug 12, 2021
Xiaoshi Wu, Hadar Averbuch-Elor, Jin Sun, Noah Snavely