Florian Metze

AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification

Apr 03, 2022
Juncheng B Li, Shuhui Qu, Po-Yao Huang, Florian Metze

On Adversarial Robustness of Large-scale Audio Visual Learning

Mar 23, 2022
Juncheng B Li, Shuhui Qu, Xinjian Li, Po-Yao Huang, Florian Metze

Speech Summarization using Restricted Self-Attention

Oct 12, 2021
Roshan Sharma, Shruti Palaskar, Alan W Black, Florian Metze

VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding

Oct 01, 2021
Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer

Differentiable Allophone Graphs for Language-Universal Speech Recognition

Jul 24, 2021
Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe

Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding

Jun 29, 2021
Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W Black

VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding

May 20, 2021
Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer

Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks

May 02, 2021
Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe

Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models

Apr 15, 2021
Po-Yao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, Alexander Hauptmann

Self-supervised object detection from audio-visual correspondence

Apr 13, 2021
Triantafyllos Afouras, Yuki M. Asano, Francois Fagan, Andrea Vedaldi, Florian Metze
