Florian Metze

Self-supervised object detection from audio-visual correspondence

Apr 13, 2021

Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning

Mar 18, 2021

NoiseQA: Challenge Set Evaluation for User-Centric Question Answering

Feb 16, 2021

Audio-Visual Event Recognition through the lens of Adversary

Nov 15, 2020

Multimodal Speech Recognition with Unstructured Audio Masking

Oct 16, 2020

On Long-Tailed Phenomena in Neural Machine Translation

Oct 10, 2020

Support-set bottlenecks for video-text representation learning

Oct 06, 2020

Fine-Grained Grounding for Multimodal Speech Recognition

Oct 05, 2020

Revisiting Factorizing Aggregated Posterior in Learning Disentangled Representations

Sep 12, 2020

How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

Aug 18, 2020