Chenyang Si

One Size, Many Fits: Aligning Diverse Group-Wise Click Preferences in Large-Scale Advertising Image Generation

Feb 02, 2026 (via arXiv)

StableWorld: Towards Stable and Consistent Long Interactive Video Generation

Jan 21, 2026 (via arXiv)

ProGuard: Towards Proactive Multimodal Safeguard

Dec 29, 2025 (via arXiv)

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Dec 15, 2025 (via arXiv)

SafeRBench: A Comprehensive Benchmark for Safety Assessment in Large Reasoning Models

Nov 19, 2025 (via arXiv)

RealDPO: Real or Not Real, that is the Preference

Oct 16, 2025 (via arXiv)

FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model

Jul 02, 2025 (via arXiv)

Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Jun 09, 2025 (via arXiv)

V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning

Mar 14, 2025 (via arXiv)

Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT

Feb 10, 2025 (via arXiv)