Kai Yu

Sherman

X-VC: Zero-shot Streaming Voice Conversion in Codec Space

Apr 14, 2026
Via arXiv

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Apr 14, 2026
Via arXiv

TASU2: Controllable CTC Simulation for Alignment and Low-Resource Adaptation of Speech LLMs

Apr 09, 2026
Via arXiv

Does Pass Rate Tell the Whole Story? Evaluating Design Constraint Compliance in LLM-based Issue Resolution

Apr 07, 2026
Via arXiv

PRIME: Prototype-Driven Multimodal Pretraining for Cancer Prognosis with Missing Modalities

Apr 05, 2026
Via arXiv

CharTool: Tool-Integrated Visual Reasoning for Chart Understanding

Apr 03, 2026
Via arXiv

EpiScreen: Early Epilepsy Detection from Electronic Health Records with Large Language Models

Mar 30, 2026
Via arXiv

SoulX-Duplug: Plug-and-Play Streaming State Prediction Module for Realtime Full-Duplex Speech Conversation

Mar 16, 2026
Via arXiv

HeartAgent: An Autonomous Agent System for Explainable Differential Diagnosis in Cardiology

Mar 11, 2026
Via arXiv

G-STAR: End-to-End Global Speaker-Tracking Attributed Recognition

Mar 11, 2026
Via arXiv