Jianlong Yuan

Frame-Level Captions for Long Video Generation with Complex Multi Scenes

May 27, 2025

Generative Pre-trained Autoregressive Diffusion Transformer

May 15, 2025

STORYANCHORS: Generating Consistent Multi-Scene Story Frames for Long-Form Narratives

May 13, 2025

Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Mar 25, 2025

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025

MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence

Jul 23, 2024

Viewpoint Integration and Registration with Vision Language Foundation Model for Image Change Understanding

Sep 15, 2023

RSGPT: A Remote Sensing Vision Language Model and Benchmark

Jul 28, 2023

UniNeXt: Exploring A Unified Architecture for Vision Recognition

May 01, 2023

Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation

Feb 28, 2023