Shaoteng Liu

Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

Dec 19, 2025
Via arXiv

EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing

Dec 12, 2025
Via arXiv

Training-Free Efficient Video Generation via Dynamic Token Carving

May 22, 2025
Via arXiv

Generative Video Propagation

Dec 27, 2024
Via arXiv

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

Mar 27, 2024
Via arXiv

RL-GPT: Integrating Reinforcement Learning and Code-as-policy

Feb 29, 2024
Via arXiv

Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code

Oct 19, 2023
Via arXiv

Self-supervised Learning by View Synthesis

Apr 22, 2023
Via arXiv

Video-P2P: Video Editing with Cross-attention Control

Mar 08, 2023
Via arXiv

Generative Model Watermarking Based on Human Visual System

Sep 30, 2022
Via arXiv