Jinyu Li

Beijing Institute of Technology, China

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Mar 05, 2024
Via arXiv

Boosting Large Language Model for Speech Synthesis: An Empirical Study

Dec 30, 2023
Via arXiv

COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning

Nov 03, 2023
Via arXiv

Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation

Oct 23, 2023
Via arXiv

RD-VIO: Robust Visual-Inertial Odometry for Mobile Augmented Reality in Dynamic Environments

Oct 23, 2023
Via arXiv

Enhanced Edge-Perceptual Guided Image Filtering

Oct 16, 2023
Via arXiv

Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach

Oct 06, 2023
Via arXiv

ResidualTransformer: Residual Low-rank Learning with Weight-sharing for Transformer Layers

Oct 03, 2023
Via arXiv

t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability

Sep 15, 2023
Via arXiv

DiariST: Streaming Speech Translation with Speaker Diarization

Sep 14, 2023
Via arXiv