Yoichi Sato

WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding

Jul 22, 2024

ActionVOS: Actions as Prompts for Video Object Segmentation

Jul 10, 2024

Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition

Jul 09, 2024

Learning Object States from Actions via Large Language Models

May 02, 2024

Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

Mar 25, 2024

Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation

Mar 09, 2024

FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation

Feb 01, 2024

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023

Exo2EgoDVC: Dense Video Captioning of Egocentric Procedural Activities Using Web Instructional Videos

Nov 29, 2023

Generative Hierarchical Temporal Transformer for Hand Action Recognition and Motion Prediction

Nov 29, 2023