Michael Picheny

Courant Computer Science and Center for Data Science, New York University

Accent-Robust Automatic Speech Recognition Using Supervised and Unsupervised Wav2vec Embeddings


Oct 08, 2021
Jialu Li, Vimal Manohar, Pooja Chitkara, Andros Tjandra, Michael Picheny, Frank Zhang, Xiaohui Zhang, Yatharth Saraf

* Submitted to ICASSP 2022 

Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos


May 05, 2021
Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie Boggust, Rameswar Panda, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Michael Picheny, Shih-Fu Chang


Accented Speech Recognition Inspired by Human Perception


Apr 09, 2021
Xiangyun Chu, Elizabeth Combs, Amber Wang, Michael Picheny

* Submitted to INTERSPEECH 2021 

Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs


Apr 07, 2021
Sujeong Cha, Wangrui Hou, Hyun Jung, My Phung, Michael Picheny, Hong-Kwang Kuo, Samuel Thomas, Edmilson Morais

* Submitted to Interspeech 2021 

Diarization of Legal Proceedings. Identifying and Transcribing Judicial Speech from Recorded Court Audio


Apr 03, 2021
Jeffrey Tumminia, Amanda Kuznecov, Sophia Tsilerides, Ilana Weinstein, Brian McFee, Michael Picheny, Aaron R. Kaufman

* Under review for InterSpeech 2021 

Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems


Oct 08, 2020
Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny

* 5 pages, published in ICASSP 2020 

AVLnet: Learning Audio-Visual Language Representations from Instructional Videos


Jun 16, 2020
Andrew Rouditchenko, Angie Boggust, David Harwath, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Rogerio Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James Glass


Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition


Feb 24, 2020
Xiaodong Cui, Wei Zhang, Ulrich Finkler, George Saon, Michael Picheny, David Kung

* Accepted to IEEE Signal Processing Magazine 

Improving Efficiency in Large-Scale Decentralized Distributed Training


Feb 04, 2020
Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David Kung, Michael Picheny

* 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP'2020) Oral 

Challenging the Boundaries of Speech Recognition: The MALACH Corpus


Aug 09, 2019
Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon

* Accepted for publication at INTERSPEECH 2019 

Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition


Jul 10, 2019
Khoi-Nguyen C. Mac, Xiaodong Cui, Wei Zhang, Michael Picheny

* Interspeech 2019 

Acoustic Model Optimization Based On Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition


Jul 10, 2019
Xiaodong Cui, Michael Picheny

* Interspeech 2019 

A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition


Jul 10, 2019
Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David Kung, Michael Picheny

* INTERSPEECH 2019 

English Broadcast News Speech Recognition by Humans and Machines


Apr 30, 2019
Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko

* © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Distributed Deep Learning Strategies For Automatic Speech Recognition


Apr 10, 2019
Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David Kung, Michael Picheny

* Published in ICASSP'19 

Acoustically Grounded Word Embeddings for Improved Acoustics-to-Word Speech Recognition


Mar 29, 2019
Shane Settle, Kartik Audhkhasi, Karen Livescu, Michael Picheny

* To appear at ICASSP 2019 

Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks


Oct 16, 2018
Xiaodong Cui, Wei Zhang, Zoltán Tüske, Michael Picheny


Building competitive direct acoustics-to-word models for English conversational speech recognition


Dec 08, 2017
Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny

* Submitted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018 

Direct Acoustics-to-Word Models for English Conversational Speech Recognition


Mar 22, 2017
Kartik Audhkhasi, Bhuvana Ramabhadran, George Saon, Michael Picheny, David Nahamoo

* Submitted to Interspeech-2017 

English Conversational Telephone Speech Recognition by Humans and Machines


Mar 06, 2017
George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall


Kernel Approximation Methods for Speech Recognition


Jan 13, 2017
Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel Hsu, Brian Kingsbury, Michael Picheny, Fei Sha


Training variance and performance evaluation of neural networks in speech


Jun 14, 2016
Ewout van den Berg, Bhuvana Ramabhadran, Michael Picheny


A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition


Mar 18, 2016
Zhiyun Lu, Dong Guo, Alireza Bagheri Garakani, Kuan Liu, Avner May, Aurelien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha

* arXiv admin note: text overlap with arXiv:1411.4000 

How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets


Jun 17, 2015
Zhiyun Lu, Avner May, Kuan Liu, Alireza Bagheri Garakani, Dong Guo, Aurélien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha


The IBM 2015 English Conversational Telephone Speech Recognition System


May 21, 2015
George Saon, Hong-Kwang J. Kuo, Steven Rennie, Michael Picheny

* Submitted to Interspeech 2015 
