Gengyuan Zhang

Localizing Events in Videos with Multimodal Queries

Jun 14, 2024

SPOT! Revisiting Video-Language Models for Event Understanding

Dec 01, 2023

Multi-event Video-Text Retrieval

Aug 22, 2023

A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models

Jul 24, 2023

Can Vision-Language Models be a Good Guesser? Exploring VLMs for Times and Location Reasoning

Jul 12, 2023

CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering

Nov 19, 2022