Lei Ji

KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization

Jul 10, 2023
Via arXiv

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Jun 28, 2023
Via arXiv

GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

Jun 27, 2023
Via arXiv

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

Mar 29, 2023
Via arXiv

MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering

Dec 19, 2022
Via arXiv

An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022

Nov 16, 2022
Via arXiv

HORIZON: A High-Resolution Panorama Synthesis Framework

Oct 10, 2022
Via arXiv

CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding

Sep 22, 2022
Via arXiv

ScaleVLAD: Improving Multimodal Sentiment Analysis via Multi-Scale Fusion of Locally Descriptors

Dec 02, 2021
Via arXiv

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion

Nov 24, 2021
Via arXiv