Florian Metze

Robustness of Neural Architectures for Audio Event Detection

May 06, 2022
Via arXiv

AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification

Apr 03, 2022
Via arXiv

On Adversarial Robustness of Large-scale Audio Visual Learning

Mar 23, 2022
Via arXiv

Speech Summarization using Restricted Self-Attention

Oct 12, 2021
Via arXiv

VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding

Oct 01, 2021
Via arXiv

Differentiable Allophone Graphs for Language-Universal Speech Recognition

Jul 24, 2021
Via arXiv

Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding

Jun 29, 2021
Via arXiv

VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding

May 20, 2021
Via arXiv

Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks

May 02, 2021
Via arXiv

Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models

Apr 15, 2021
Via arXiv