Jason Li

Sandy

HiFiTTS-2: A Large-Scale High Bandwidth Speech Dataset

Jun 04, 2025
Via arXiv

Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model

May 21, 2025
Via arXiv

Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance

Feb 07, 2025
Via arXiv

TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer

Jan 10, 2025
Via arXiv

Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference

Sep 18, 2024
Via arXiv

Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment

Jun 25, 2024
Via arXiv

Mechanistic Interpretability of Binary and Ternary Transformers

May 27, 2024
Via arXiv

zkLLM: Zero Knowledge Proofs for Large Language Models

Apr 24, 2024
Via arXiv

SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation

Oct 13, 2023
Via arXiv

AstroLLaMA: Towards Specialized Foundation Models in Astronomy

Sep 12, 2023
Via arXiv