Peidong Wang

Improving Speech Recognition Error Prediction for Modern and Off-the-shelf Speech Recognizers

Aug 21, 2024
Via arXiv

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation

Jun 12, 2024
Via arXiv

Variational Optimization for Quantum Problems using Deep Generative Networks

Apr 28, 2024
Via arXiv

StickerConv: Generating Multimodal Empathetic Responses from Scratch

Jan 20, 2024
Via arXiv

Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation

Oct 23, 2023
Via arXiv

Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach

Oct 06, 2023
Via arXiv

DiariST: Streaming Speech Translation with Speaker Diarization

Sep 14, 2023
Via arXiv

Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

Mar 01, 2023
Via arXiv

Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition

Nov 10, 2022
Via arXiv

LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers

Nov 05, 2022
Via arXiv