Yiwen Shao

RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Oct 31, 2023

UniX-Encoder: A Universal $X$-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing

Oct 25, 2023

Challenges and Insights: Exploring 3D Spatial Features and Complex Networks on the MISP Dataset

Oct 05, 2023

Defense against Adversarial Attacks on Hybrid Speech Recognition using Joint Adversarial Fine-tuning with Denoiser

Apr 08, 2022

Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature

Nov 22, 2021

Adversarial Attacks and Defenses for Speech Recognition Systems

Mar 31, 2021

PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR

May 20, 2020

Espresso: A Fast End-to-end Neural Speech Recognition Toolkit

Oct 15, 2019