Nakul Agarwal

M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models

Jul 19, 2024
Via arXiv

Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models

May 30, 2024
Via arXiv

Multi-Objective Recommendation via Multivariate Policy Learning

May 03, 2024
Via arXiv

Disentangled Neural Relational Inference for Interpretable Motion Prediction

Jan 07, 2024
Via arXiv

Vamos: Versatile Action Models for Video Understanding

Nov 22, 2023
Via arXiv

Object-centric Video Representation for Long-term Action Anticipation

Oct 31, 2023
Via arXiv

Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning

Sep 12, 2023
Via arXiv

AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

Jul 31, 2023
Via arXiv

Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos

Mar 24, 2022
Via arXiv

Unsupervised Domain Adaptation for Spatio-Temporal Action Localization

Oct 19, 2020
Via arXiv