Olivier Siohan

Google Inc

End-to-End Multi-Person Audio/Visual Automatic Speech Recognition



Otavio Braga, Takaki Makino, Olivier Siohan, Hank Liao



A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection



Otavio Braga, Olivier Siohan

* arXiv admin note: text overlap with arXiv:2205.05586 


Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection



Otavio Braga, Olivier Siohan



End-to-end multi-talker audio-visual ASR using an active speaker attention module



Richard Rose, Olivier Siohan

* 5 pages, 3 figures, 3 tables, 28 citations 


Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition



Dmitriy Serdyuk, Otavio Braga, Olivier Siohan



Audio-Visual Speech Recognition is Worth 32×32×8 Voxels



Dmitriy Serdyuk, Otavio Braga, Olivier Siohan

* 7 pages, 2 figures, 4 tables. Draft of a paper accepted to the ASRU workshop


Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models



Thibault Doutre, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Olivier Siohan, Liangliang Cao



Recurrent Neural Network Transducer for Audio-Visual Speech Recognition



Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan

* To be presented at the 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2019)
