Kumar Ashutosh

IIT Bombay

SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos

Apr 08, 2024

Detours for Navigating Instructional Videos

Jan 03, 2024

Learning Object State Changes in Videos: An Open-World Perspective

Dec 19, 2023

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023

Video-Mined Task Graphs for Keystep Recognition in Instructional Videos

Jul 17, 2023

What You Say Is What You Show: Visual Narration Detection in Instructional Videos

Jan 05, 2023

HierVL: Learning Hierarchical Video-Language Embeddings

Jan 05, 2023

RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging

Oct 15, 2022

3D-NVS: A 3D Supervision Approach for Next View Selection

Dec 03, 2020

Lower Bounds for Policy Iteration on Multi-action MDPs

Sep 16, 2020