Lei Ji

G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios

May 13, 2024
Via arXiv

Exploring Diffusion Time-steps for Unsupervised Representation Learning

Jan 21, 2024
Via arXiv

ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

Jan 01, 2024
Via arXiv

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

Oct 28, 2023
Via arXiv

KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization

Jul 10, 2023
Via arXiv

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Jun 28, 2023
Via arXiv

GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

Jun 27, 2023
Via arXiv

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

Mar 29, 2023
Via arXiv

MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering

Dec 19, 2022
Via arXiv

An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022

Nov 16, 2022
Via arXiv