An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition



Niko Moritz, Frank Seide, Duc Le, Jay Mahadeokar, Christian Fuegen

* Submitted to Interspeech 2022 


Scaling ASR Improves Zero and Few Shot Learning



Alex Xiao, Weiyi Zheng, Gil Keren, Duc Le, Frank Zhang, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Abdelrahman Mohamed



Ego4D: Around the World in 3,000 Hours of Egocentric Video



Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik



Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric



Suyoun Kim, Duc Le, Weiyi Zheng, Tarun Singh, Abhinav Arora, Xiaoyu Zhai, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer

* Submitted to ICASSP 2022


Do sound event representations generalize to other audio tasks? A case study in audio transfer learning



Anurag Kumar, Yun Wang, Vamsi Krishna Ithapu, Christian Fuegen

* Accepted at Interspeech 2021


Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios



Jay Mahadeokar, Yangyang Shi, Yuan Shangguan, Chunyang Wu, Alex Xiao, Hang Su, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer

* Submitted to Interspeech 2021 (under review) 


Dissecting User-Perceived Latency of On-Device E2E Speech Recognition



Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer

* Submitted to Interspeech 2021 


Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion



Duc Le, Mahaveer Jain, Gil Keren, Suyoun Kim, Yangyang Shi, Jay Mahadeokar, Julian Chan, Yuan Shangguan, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Michael L. Seltzer

* Submitted to Interspeech 2021

