Andrew Zisserman

DeepMind

Temporal Query Networks for Fine-grained Video Understanding

Apr 19, 2021

TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval

Apr 16, 2021

Self-supervised Video Object Segmentation by Motion Grouping

Apr 15, 2021

Localizing Visual Sounds the Hard Way

Apr 06, 2021

Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval

Apr 01, 2021

Broaden Your Views for Self-Supervised Video Learning

Mar 30, 2021

Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers

Mar 30, 2021

Read and Attend: Temporal Localisation in Sign Language Videos

Mar 30, 2021

Slow-Fast Auditory Streams For Audio Recognition

Mar 05, 2021

Perceiver: General Perception with Iterative Attention

Mar 04, 2021