Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

Continual Learning for On-Device Speech Recognition using Disentangled Conformers


Dec 13, 2022
Anuj Diwan, Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Eunsol Choi, David Harwath, Abdelrahman Mohamed

Add code

* 8 pages, 2 figures. Submitted to ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Unsupervised Fine-Tuning Data Selection for ASR Using Self-Supervised Speech Models


Dec 03, 2022
Reem Gody, David Harwath

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality


Nov 11, 2022
Anuj Diwan, Layne Berry, Eunsol Choi, David Harwath, Kyle Mahowald

Add code

* Accepted at EMNLP 2022. We release our annotation and code at https://github.com/ajd12342/why-winoground-hard . 15 pages, 3 figures 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Phoneme Segmentation Using Self-Supervised Speech Models


Nov 02, 2022
Luke Strgar, David Harwath

Add code

* Accepted to SLT 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval


Nov 02, 2022
Layne Berry, Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Hung-yi Lee, David Harwath

Add code

* Submitted to ICASSP 2023 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval


Oct 07, 2022
Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogerio Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James Glass

Add code


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model


Oct 03, 2022
Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Layne Berry, Hung-yi Lee, David Harwath

Add code

* Accepted to IEEE SLT 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

MAE-AST: Masked Autoencoding Audio Spectrogram Transformer


Mar 30, 2022
Alan Baade, Puyuan Peng, David Harwath

Add code

* Submitted to INTERSPEECH. 5 pages, 2 figures, 5 tables 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email
1
2
3
4
>>