Lei Ji

Exploring Diffusion Time-steps for Unsupervised Representation Learning

Jan 21, 2024
Zhongqi Yue, Jiankun Wang, Qianru Sun, Lei Ji, Eric I-Chao Chang, Hanwang Zhang

ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

Jan 01, 2024
Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

Oct 28, 2023
Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei Ji, Eric I-Chao Chang, Tackeun Kim, Edward Choi

KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization

Jul 10, 2023
Gangwoo Kim, Hajung Kim, Lei Ji, Seongsu Bae, Chanhwi Kim, Mujeen Sung, Hyunjae Kim, Kun Yan, Eric Chang, Jaewoo Kang

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Jun 28, 2023
Difei Gao, Lei Ji, Luowei Zhou, Kevin Qinghong Lin, Joya Chen, Zihan Fan, Mike Zheng Shou

GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

Jun 27, 2023
Zhijian Hou, Lei Ji, Difei Gao, Wanjun Zhong, Kun Yan, Chao Li, Wing-Kwong Chan, Chong-Wah Ngo, Nan Duan, Mike Zheng Shou

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

Mar 29, 2023
Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan

MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering

Dec 19, 2022
Difei Gao, Luowei Zhou, Lei Ji, Linchao Zhu, Yi Yang, Mike Zheng Shou
