Zongli Ye

Asymmetric Hierarchical Anchoring for Audio-Visual Joint Representation: Resolving Information Allocation Ambiguity for Robust Cross-Modal Generalization

Feb 03, 2026

LCS-CTC: Leveraging Soft Alignments to Enhance Phonetic Transcription Robustness

Aug 05, 2025

Dysfluent WFST: A Framework for Zero-Shot Speech Dysfluency Transcription and Detection

May 22, 2025