Picture for Shujie Liu

Shujie Liu

Target Sound Extraction with Variable Cross-modality Clues

Add code
Mar 15, 2023
Viaarxiv icon

Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

Add code
Mar 07, 2023
Viaarxiv icon

Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

Add code
Mar 01, 2023
Figure 1 for Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training
Figure 2 for Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training
Figure 3 for Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training
Figure 4 for Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training
Viaarxiv icon

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

Add code
Jan 05, 2023
Figure 1 for Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Figure 2 for Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Figure 3 for Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Figure 4 for Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Viaarxiv icon

BEATs: Audio Pre-Training with Acoustic Tokenizers

Add code
Dec 18, 2022
Viaarxiv icon

VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning

Add code
Nov 21, 2022
Figure 1 for VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning
Figure 2 for VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning
Figure 3 for VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning
Figure 4 for VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning
Viaarxiv icon

Exploring WavLM on Speech Enhancement

Add code
Nov 18, 2022
Viaarxiv icon

LongFNT: Long-form Speech Recognition with Factorized Neural Transducer

Add code
Nov 17, 2022
Viaarxiv icon

LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers

Add code
Nov 05, 2022
Figure 1 for LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers
Figure 2 for LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers
Figure 3 for LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers
Figure 4 for LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers
Viaarxiv icon

Two-Stream Network for Sign Language Recognition and Translation

Add code
Nov 02, 2022
Viaarxiv icon