Yu Wu

Wuhan University

LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers

Nov 05, 2022
Via arXiv

Two-Stream Network for Sign Language Recognition and Translation

Nov 02, 2022
Via arXiv

Real-time Speech Interruption Analysis: From Cloud to Client Deployment

Oct 24, 2022
Via arXiv

Foundation Transformers

Oct 19, 2022
Via arXiv

STAR: Zero-Shot Chinese Character Recognition with Stroke- and Radical-Level Decompositions

Oct 16, 2022
Via arXiv

Vision+X: A Survey on Multimodal Learning in the Light of Data

Oct 05, 2022
Via arXiv

SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data

Sep 30, 2022
Via arXiv

SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding

Jul 27, 2022
Via arXiv

Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization

Jun 29, 2022
Via arXiv

Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training

Jun 21, 2022
Via arXiv