Xun Guo

I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models

Dec 27, 2023
Via arXiv

Hulk: A Universal Knowledge Translator for Human-Centric Tasks

Dec 05, 2023
Via arXiv

StableVideo: Text-driven Consistency-aware Diffusion Video Editing

Aug 18, 2023
Via arXiv

MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

Jul 31, 2023
Via arXiv

Alignment-guided Temporal Attention for Video Action Recognition

Sep 30, 2022
Via arXiv

Rethinking Minimal Sufficient Representation in Contrastive Learning

Apr 02, 2022
Via arXiv

Semantic-aligned Fusion Transformer for One-shot Object Detection

Mar 20, 2022
Via arXiv

Self-Supervised Video Representation Learning with Meta-Contrastive Network

Aug 23, 2021
Via arXiv

SSAN: Separable Self-Attention Network for Video Representation Learning

May 27, 2021
Via arXiv

In Defense of the Classification Loss for Person Re-Identification

Sep 16, 2018
Via arXiv