Zihui Xue

Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos

Jun 13, 2024
Via arXiv

HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness

Jun 11, 2024
Via arXiv

Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos

Mar 11, 2024
Via arXiv

Detours for Navigating Instructional Videos

Jan 03, 2024
Via arXiv

Learning Object State Changes in Videos: An Open-World Perspective

Dec 19, 2023
Via arXiv

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023
Via arXiv

Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment

Jun 08, 2023
Via arXiv

Egocentric Video Task Translation @ Ego4D Challenge 2022

Feb 03, 2023
Via arXiv

Egocentric Video Task Translation

Dec 13, 2022
Via arXiv

The Modality Focusing Hypothesis: On the Blink of Multimodal Knowledge Distillation

Jun 13, 2022
Via arXiv