Olivier Siohan

Google Inc

Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses

Sep 17, 2025

On Robustness to Missing Video for Audiovisual Speech Recognition

Dec 19, 2023

Revisiting the Entropy Semiring for Neural Speech Recognition

Dec 19, 2023

Audio-visual fine-tuning of audio-only ASR models

Dec 14, 2023

Cascaded encoders for fine-tuning ASR models on overlapped speech

Jun 28, 2023

Conformers are All You Need for Visual Speech Recognition

Feb 17, 2023

End-to-End Multi-Person Audio/Visual Automatic Speech Recognition

May 11, 2022

A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection

May 11, 2022

Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection

May 10, 2022

End-to-end multi-talker audio-visual ASR using an active speaker attention module

Apr 01, 2022