
Shengming Yin

Qwen-Image-VAE-2.0 Technical Report

May 13, 2026
Via arXiv

Qwen-Image-2.0 Technical Report

May 11, 2026
Via arXiv

Conflicts Make Large Reasoning Models Vulnerable to Attacks

Apr 10, 2026
Via arXiv

Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition

Dec 17, 2025
Via arXiv

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025
Via arXiv

EG4D: Explicit Generation of 4D Object without Score Distillation

May 28, 2024
Via arXiv

Using Left and Right Brains Together: Towards Vision and Language Planning

Feb 16, 2024
Via arXiv

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis

Jan 30, 2024
Via arXiv

ORES: Open-vocabulary Responsible Visual Synthesis

Aug 26, 2023
Via arXiv

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory

Aug 16, 2023
Via arXiv