Weiming Ren

HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming

Dec 24, 2025
Via arXiv

OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Dec 08, 2025
Via arXiv

VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation

May 20, 2025
Via arXiv

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Mar 14, 2025
Via arXiv

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Dec 01, 2024
Via arXiv

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Nov 11, 2024
Via arXiv

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Jun 04, 2024
Via arXiv

Video Diffusion Models: A Survey

May 06, 2024
Via arXiv

AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks

Mar 22, 2024
Via arXiv

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Feb 28, 2024
Via arXiv