Zhengyuan Yang

MultiSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

Jun 07, 2023
Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Bo Li, Ding Zhao, Lijuan Wang

Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation

Apr 14, 2023
Jaemin Cho, Linjie Li, Zhengyuan Yang, Zhe Gan, Lijuan Wang, Mohit Bansal

Equivariant Similarity for Vision-Language Foundation Models

Mar 25, 2023
Tan Wang, Kevin Lin, Linjie Li, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang

Revisiting Transformer for Point Cloud-based 3D Scene Graph Generation

Mar 23, 2023
Changsheng Lv, Mengshi Qi, Xia Li, Zhengyuan Yang, Huadong Ma

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation

Mar 22, 2023
Shengming Yin, Chenfei Wu, Huan Yang, Jianfeng Wang, Xiaodong Wang, Minheng Ni, Zhengyuan Yang, Linjie Li, Shuguang Liu, Fan Yang, Jianlong Fu, Gong Ming, Lijuan Wang, Zicheng Liu, Houqiang Li, Nan Duan

MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action

Mar 20, 2023
Zhengyuan Yang, Linjie Li, Jianfeng Wang, Kevin Lin, Ehsan Azarnasab, Faisal Ahmed, Zicheng Liu, Ce Liu, Michael Zeng, Lijuan Wang

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images

Feb 21, 2023
Xiaodong Wang, Chenfei Wu, Shengming Yin, Minheng Ni, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Fan Yang, Lijuan Wang, Zicheng Liu, Yuejian Fang, Nan Duan

GRiT: A Generative Region-to-text Transformer for Object Understanding

Dec 01, 2022
Jialian Wu, Jianfeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang

ReCo: Region-Controlled Text-to-Image Generation

Nov 23, 2022
Zhengyuan Yang, Jianfeng Wang, Zhe Gan, Linjie Li, Kevin Lin, Chenfei Wu, Nan Duan, Zicheng Liu, Ce Liu, Michael Zeng, Lijuan Wang
