Mohan Zhou

V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation

Mar 10, 2025
Via arXiv

STAR: Scale-wise Text-to-image generation via Auto-Regressive representations

Jun 16, 2024
Via arXiv

StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models

Jan 25, 2024
Via arXiv

Learning and Evaluating Human Preferences for Conversational Head Generation

Aug 02, 2023
Via arXiv

Interactive Conversational Head Generation

Jul 05, 2023
Via arXiv

Visual-Aware Text-to-Speech

Jun 21, 2023
Via arXiv

Responsive Listening Head Generation: A Benchmark Dataset and Baseline

Dec 27, 2021
Via arXiv

Augmentation Pathways Network for Visual Recognition

Jul 26, 2021
Via arXiv

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Mar 31, 2020
Via arXiv