Linjie Li

Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation

Oct 12, 2023
Zhengyuan Yang, Jianfeng Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Lijuan Wang

Via arXiv

OpenLEAF: Open-Domain Interleaved Image-Text Generation and Evaluation

Oct 11, 2023
Jie An, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Kevin Lin, Zicheng Liu, Lijuan Wang, Jiebo Luo

Via arXiv

The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)

Oct 11, 2023
Zhengyuan Yang, Linjie Li, Kevin Lin, Jianfeng Wang, Chung-Ching Lin, Zicheng Liu, Lijuan Wang

Via arXiv

Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Sep 18, 2023
Chunyuan Li, Zhe Gan, Zhengyuan Yang, Jianwei Yang, Linjie Li, Lijuan Wang, Jianfeng Gao

Via arXiv

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities

Aug 04, 2023
Weihao Yu, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Kevin Lin, Zicheng Liu, Xinchao Wang, Lijuan Wang

Via arXiv

Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models

Jul 27, 2023
Xin Yuan, Linjie Li, Jianfeng Wang, Zhengyuan Yang, Kevin Lin, Zicheng Liu, Lijuan Wang

Via arXiv

DisCo: Disentangled Control for Referring Human Dance Generation in Real World

Jun 30, 2023
Tan Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang

Via arXiv

Aligning Large Multi-Modal Model with Robust Instruction Tuning

Jun 26, 2023
Fuxiao Liu, Kevin Lin, Linjie Li, Jianfeng Wang, Yaser Yacoob, Lijuan Wang

Via arXiv

MultiSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

Jun 07, 2023
Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Bo Li, Ding Zhao, Lijuan Wang

Via arXiv