Story Visualization


Story visualization is the task of generating a coherent, aligned sequence of images from a sequence of textual captions that describe a story. It mainly consists of two tasks: story generation and story continuation, where story continuation additionally uses ground-truth information in the form of the first frame.
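The difference between the two tasks can be sketched as a small interface. This is only an illustration under stated assumptions, not any listed paper's method: `StoryTask`, `visualize`, and `toy_generator` are hypothetical names, images are stood in for by strings, and the captions are assumed to describe the frames to be generated (the ground-truth first frame, when present, is used only as conditioning).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical stand-in for an image; a real system would use pixel arrays or tensors.
Image = str


@dataclass
class StoryTask:
    captions: List[str]                   # one caption per frame to generate
    first_frame: Optional[Image] = None   # ground-truth first frame (continuation only)

    @property
    def mode(self) -> str:
        # Story continuation is distinguished only by the extra first frame.
        return "continuation" if self.first_frame is not None else "generation"


def visualize(
    task: StoryTask,
    generate_frame: Callable[[str, List[Image]], Image],
) -> List[Image]:
    """Return one image per caption, conditioning each on all earlier frames."""
    # Assumption: the given first frame seeds the history but is not re-generated.
    history: List[Image] = [task.first_frame] if task.first_frame is not None else []
    generated: List[Image] = []
    for caption in task.captions:
        generated.append(generate_frame(caption, history + generated))
    return generated


def toy_generator(caption: str, history: List[Image]) -> Image:
    # Placeholder model: records the frame position and caption as text.
    return f"<image {len(history) + 1}: {caption}>"


if __name__ == "__main__":
    story = StoryTask(["A fox wakes up.", "The fox finds a river."])
    print(story.mode, visualize(story, toy_generator))

    continuation = StoryTask(["The fox finds a river."], first_frame="<ground-truth frame>")
    print(continuation.mode, visualize(continuation, toy_generator))
```

Passing the running history to the generator is one simple way to express the coherence requirement; actual models enforce consistency in many different ways.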

MUSE: A Multi-agent Framework for Unconstrained Story Envisioning via Closed-Loop Cognitive Orchestration

Feb 03, 2026, via arXiv

ReDiStory: Region-Disentangled Diffusion for Consistent Visual Story Generation

Feb 01, 2026, via arXiv

Hierarchical Adaptive Eviction for KV Cache Management in Multimodal Language Models

Feb 02, 2026, via arXiv

StoryState: Agent-Based State Control for Consistent and Editable Storybooks

Feb 01, 2026, via arXiv

DeCorStory: Gram-Schmidt Prompt Embedding Decorrelation for Consistent Storytelling

Feb 01, 2026, via arXiv

Vidmento: Creating Video Stories Through Context-Aware Expansion With Generative Video

Jan 29, 2026, via arXiv

DiffusionCinema: Text-to-Aerial Cinematography

Jan 24, 2026, via arXiv

MPCI-Bench: A Benchmark for Multimodal Pairwise Contextual Integrity Evaluation of Language Model Agents

Jan 13, 2026, via arXiv

HiVid-Narrator: Hierarchical Video Narrative Generation with Scene-Primed ASR-anchored Compression

Jan 12, 2026, via arXiv

VideoMemory: Toward Consistent Video Generation via Memory Integration

Jan 07, 2026, via arXiv