Gang Yu

GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting

Apr 28, 2024
Hongyun Yu, Zhan Qu, Qihang Yu, Jianchuan Chen, Zhonghua Jiang, Zhiwen Chen, Shengyu Zhang, Jimin Xu, Fei Wu, Chengfei Lv, Gang Yu

Generative Motion Stylization within Canonical Motion Space

Mar 18, 2024
Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang Tu

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Mar 08, 2024
Xiwei Hu, Rui Wang, Yixiao Fang, Bin Fu, Pei Cheng, Gang Yu

MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies

Mar 03, 2024
Zhende Song, Chenchen Wang, Jiamu Sheng, Chi Zhang, Gang Yu, Jiayuan Fan, Tao Chen

AppAgent: Multimodal Agents as Smartphone Users

Dec 22, 2023
Chi Zhang, Zhao Yang, Jiaxuan Liu, Yucheng Han, Xin Chen, Zebiao Huang, Bin Fu, Gang Yu

Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models

Dec 22, 2023
Xianfang Zeng, Xin Chen, Zhongqi Qi, Wen Liu, Zibo Zhao, Zhibin Wang, Bin Fu, Yong Liu, Gang Yu

M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts

Dec 17, 2023
Mingsheng Li, Xin Chen, Chi Zhang, Sijin Chen, Hongyuan Zhu, Fukun Yin, Gang Yu, Tao Chen

FaceStudio: Put Your Face Everywhere in Seconds

Dec 06, 2023
Yuxuan Yan, Chi Zhang, Rui Wang, Yichao Zhou, Gege Zhang, Pei Cheng, Gang Yu, Bin Fu

ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model

Dec 01, 2023
Fukun Yin, Xin Chen, Chi Zhang, Biao Jiang, Zibo Zhao, Jiayuan Fan, Gang Yu, Taihao Li, Tao Chen

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

Nov 30, 2023
Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen
