M&M Mix: A Multimodal Multiview Transformer Ensemble



Xuehan Xiong, Anurag Arnab, Arsha Nagrani, Cordelia Schmid

* Technical report for Epic-Kitchens challenge 2022 


AVATAR: Unconstrained Audiovisual Speech Recognition



Valentin Gabeur, Paul Hongsuck Seo, Arsha Nagrani, Chen Sun, Karteek Alahari, Cordelia Schmid



A CLIP-Hitchhiker's Guide to Long Video Retrieval



Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman



Learning Audio-Video Modalities from Image Captions



Arsha Nagrani, Paul Hongsuck Seo, Bryan Seybold, Anja Hauth, Santiago Manen, Chen Sun, Cordelia Schmid



End-to-end Generative Pretraining for Multimodal Video Captioning



Paul Hongsuck Seo, Arsha Nagrani, Anurag Arnab, Cordelia Schmid



VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge



Andrew Brown, Jaesung Huh, Joon Son Chung, Arsha Nagrani, Andrew Zisserman

* arXiv admin note: substantial text overlap with arXiv:2012.06867 


Audio-Visual Synchronisation in the wild



Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman



Masking Modalities for Cross-modal Video Retrieval



Valentin Gabeur, Arsha Nagrani, Chen Sun, Karteek Alahari, Cordelia Schmid

* Accepted at WACV 2022 


With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition



Evangelos Kazakos, Jaesung Huh, Arsha Nagrani, Andrew Zisserman, Dima Damen

* Accepted at BMVC 2021 

