Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval


Oct 07, 2022
Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogerio Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James Glass

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

UAVM: A Unified Model for Audio-Visual Learning


Jul 29, 2022
Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James Glass

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification


Mar 13, 2022
Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James Glass

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval


Dec 08, 2021
Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Hilde Kuehne

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Routing with Self-Attention for Multimodal Capsule Networks


Dec 01, 2021
Kevin Duarte, Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Samuel Thomas, Alexander Liu, David Harwath, James Glass, Hilde Kuehne, Mubarak Shah

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Cascaded Multilingual Audio-Visual Learning from Videos


Nov 08, 2021
Andrew Rouditchenko, Angie Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, James Glass

Add code

* Presented at Interspeech 2021. This version contains updated results using the YouCook-Japanese dataset 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset


Oct 14, 2021
Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James Glass

Add code

* Presented at Interspeech 2021. This version contains additional experiments on the Spoken ObjectNet test set 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Cross-Modal Discrete Representation Learning


Jun 10, 2021
Alexander H. Liu, SouYoung Jin, Cheng-I Jeff Lai, Andrew Rouditchenko, Aude Oliva, James Glass

Add code

* Preprint 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos


May 05, 2021
Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie Boggust, Rameswar Panda, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Michael Picheny, Shih-Fu Chang

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email
1
2
>>