
Jifeng Dai

Denoising Diffusion Semantic Segmentation with Mask Prior Modeling

Jun 22, 2023
Zeqiang Lai, Yuchen Duan, Jifeng Dai, Ziheng Li, Ying Fu, Hongsheng Li, Yu Qiao, Wenhai Wang


ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process

Jun 08, 2023
Changyao Tian, Chenxin Tao, Jifeng Dai, Hao Li, Ziheng Li, Lewei Lu, Xiaogang Wang, Hongsheng Li, Gao Huang, Xizhou Zhu


FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow

Jun 08, 2023
Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Yijin Li, Hongwei Qin, Jifeng Dai, Xiaogang Wang, Hongsheng Li


Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory

Jun 01, 2023
Xizhou Zhu, Yuntao Chen, Hao Tian, Chenxin Tao, Weijie Su, Chenyu Yang, Gao Huang, Bin Li, Lewei Lu, Xiaogang Wang, Yu Qiao, Zhaoxiang Zhang, Jifeng Dai


VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks

May 25, 2023
Wenhai Wang, Zhe Chen, Xiaokang Chen, Jiannan Wu, Xizhou Zhu, Gang Zeng, Ping Luo, Tong Lu, Jie Zhou, Yu Qiao, Jifeng Dai


EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought

May 24, 2023
Yao Mu, Qinglong Zhang, Mengkang Hu, Wenhai Wang, Mingyu Ding, Jun Jin, Bin Wang, Jifeng Dai, Yu Qiao, Ping Luo


InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

May 11, 2023
Zhaoyang Liu, Yinan He, Wenhai Wang, Weiyun Wang, Yi Wang, Shoufa Chen, Qinglong Zhang, Yang Yang, Qingyun Li, Jiashuo Yu, Kunchang Li, Zhe Chen, Xue Yang, Xizhou Zhu, Yali Wang, Limin Wang, Ping Luo, Jifeng Dai, Yu Qiao


VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation

Mar 17, 2023
Xiaoyu Shi, Zhaoyang Huang, Weikang Bian, Dasong Li, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li
