Difei Gao

Learning Video Context as Interleaved Multimodal Sequences

Jul 31, 2024
Via arXiv

GUI Action Narrator: Where and When Did That Action Take Place?

Jun 19, 2024
Via arXiv

VideoLLM-online: Online Video Large Language Model for Streaming Video

Jun 17, 2024
Via arXiv

VideoGUI: A Benchmark for GUI Automation from Instructional Videos

Jun 14, 2024
Via arXiv

LOVA3: Learning to Visual Question Answering, Asking and Assessment

May 23, 2024
Via arXiv

Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces

Jan 24, 2024
Via arXiv

ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

Jan 01, 2024
Via arXiv

ViT-Lens-2: Gateway to Omni-modal Intelligence

Nov 27, 2023
Via arXiv

CVPR 2023 Text Guided Video Editing Competition

Oct 24, 2023
Via arXiv

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

Sep 27, 2023
Via arXiv