Otavio Braga

Google Inc

On Robustness to Missing Video for Audiovisual Speech Recognition

Dec 19, 2023
Oscar Chang, Otavio Braga, Hank Liao, Dmitriy Serdyuk, Olivier Siohan

Audio-visual fine-tuning of audio-only ASR models

Dec 14, 2023
Avner May, Dmitriy Serdyuk, Ankit Parag Shah, Otavio Braga, Olivier Siohan

End-to-End Multi-Person Audio/Visual Automatic Speech Recognition

May 11, 2022
Otavio Braga, Takaki Makino, Olivier Siohan, Hank Liao

A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection

May 11, 2022
Otavio Braga, Olivier Siohan

Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection

May 10, 2022
Otavio Braga, Olivier Siohan

Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition

Jan 25, 2022
Dmitriy Serdyuk, Otavio Braga, Olivier Siohan

Audio-Visual Speech Recognition is Worth 32×32×8 Voxels

Sep 20, 2021
Dmitriy Serdyuk, Otavio Braga, Olivier Siohan

Recurrent Neural Network Transducer for Audio-Visual Speech Recognition

Nov 08, 2019
Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan
