Xiaoqian Shen

Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling

Aug 07, 2024
Via arXiv

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos

Jul 17, 2024
Via arXiv

iMotion-LLM: Motion Prediction Instruction Tuning

Jun 11, 2024
Via arXiv

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens

Apr 04, 2024
Via arXiv

Large Language Models as Consistent Story Visualizers

Dec 04, 2023
Via arXiv

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Oct 26, 2023
Via arXiv

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations

Sep 12, 2023
Via arXiv

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

Apr 20, 2023
Via arXiv

HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models

Apr 11, 2023
Via arXiv

MoStGAN-V: Video Generation with Temporal Motion Styles

Apr 05, 2023
Via arXiv