Dima Damen

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023

Centre Stage: Centricity-based Audio-Visual Temporal Action Detection

Nov 28, 2023

Learning Temporal Sentence Grounding From Narrated EgoVideos

Oct 26, 2023

An Outlook into the Future of Egocentric Vision

Aug 14, 2023

What can a cook in Italy teach a mechanic in India? Action Recognition Generalisation Over Scenarios and Locations

Jun 14, 2023

EPIC Fields: Marrying 3D Geometry and Video Understanding

Jun 14, 2023

Perception Test: A Diagnostic Benchmark for Multimodal Video Models

May 23, 2023

Use Your Head: Improving Long-Tail Video Recognition

Apr 03, 2023

Epic-Sounds: A Large-scale Dataset of Actions That Sound

Feb 01, 2023

Refining Action Boundaries for One-stage Detection

Oct 25, 2022