Vlog Dataset


A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model

Nov 07, 2024

We Care: Multimodal Depression Detection and Knowledge Infused Mental Health Therapeutic Response Generation

Jun 15, 2024

MOGAM: A Multimodal Object-oriented Graph Attention Model for Depression Detection

Mar 21, 2024

Understanding Video Scenes through Text: Insights from Text-based Video Question Answering

Sep 11, 2023

Human Action Co-occurrence in Lifestyle Vlogs using Graph Link Prediction

Sep 22, 2023

Album Storytelling with Iterative Story-aware Captioning and Large Language Models

May 24, 2023

Open-Domain Sign Language Translation Learned from Online Video

May 25, 2022

When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs

Feb 21, 2022

A Bilingual, OpenWorld Video Text Dataset and End-to-end Video Text Spotter with Transformer

Dec 09, 2021

WhyAct: Identifying Action Reasons in Lifestyle Vlogs

Sep 09, 2021