Jifeng Dai

Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft

Dec 14, 2023

InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation

Nov 30, 2023

Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision

Nov 23, 2023

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Oct 30, 2023

Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models

Oct 12, 2023

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World

Aug 03, 2023

JourneyDB: A Benchmark for Generative Image Understanding

Jul 03, 2023

Denoising Diffusion Semantic Segmentation with Mask Prior Modeling

Jun 22, 2023

FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow

Jun 08, 2023

ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process

Jun 08, 2023