Kai Yu

Sherman

Recent Advances in Discrete Speech Tokens: A Review

Feb 10, 2025

From Generalist to Specialist: A Survey of Large Language Models for Chemistry

Dec 28, 2024

AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures

Dec 25, 2024

Neural Directed Speech Enhancement with Dual Microphone Array in High Noise Scenario

Dec 24, 2024

Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective

Dec 22, 2024

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training

Dec 20, 2024

NTC-KWS: Noise-aware CTC for Robust Keyword Spotting

Dec 17, 2024

Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency

Dec 17, 2024

VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization

Dec 13, 2024

Reducing Tool Hallucination via Reliability Alignment

Dec 05, 2024