Xiaoqian Shen

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens

Apr 04, 2024
Kirolos Ataallah, Xiaoqian Shen, Eslam Abdelrahman, Essam Sleiman, Deyao Zhu, Jian Ding, Mohamed Elhoseiny

Large Language Models as Consistent Story Visualizers

Dec 04, 2023
Xiaoqian Shen, Mohamed Elhoseiny

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Oct 26, 2023
Jun Chen, Deyao Zhu, Xiaoqian Shen, Xiang Li, Zechun Liu, Pengchuan Zhang, Raghuraman Krishnamoorthi, Vikas Chandra, Yunyang Xiong, Mohamed Elhoseiny

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations

Sep 12, 2023
Kilichbek Haydarov, Xiaoqian Shen, Avinash Madasu, Mahmoud Salem, Li-Jia Li, Gamaleldin Elsayed, Mohamed Elhoseiny

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

Apr 20, 2023
Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny

HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models

Apr 11, 2023
Eslam Mohamed Bakr, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, Mohamed Elhoseiny

MoStGAN-V: Video Generation with Temporal Motion Styles

Apr 05, 2023
Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny

ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions

Mar 12, 2023
Deyao Zhu, Jun Chen, Kilichbek Haydarov, Xiaoqian Shen, Wenxuan Zhang, Mohamed Elhoseiny
