Haoning Xu

Confidence Score Guided Incremental and Speaker Adaptive Pseudo-Labeling for Semi-Supervised Elderly Speech Recognition

Jun 15, 2026
Via arXiv

Decoding while Adapting: Zero-Shot Online Speaker Adaptation via Audio-Textual Prompts for Elderly Speech Recognition

Jun 15, 2026
Via arXiv

Towards Data-free and Training-free Compression for Speech Foundation Models Using Parameter Clustering

Jun 11, 2026
Via arXiv

UNISON: A Unified Sound Generation and Editing Framework via Deep LLM Fusion

May 29, 2026
Via arXiv

MTR-DuplexBench: Towards a Comprehensive Evaluation of Multi-Round Conversations for Full-Duplex Speech Language Models

Nov 13, 2025
Via arXiv

MOPSA: Mixture of Prompt-Experts Based Speaker Adaptation for Elderly Speech Recognition

May 30, 2025
Via arXiv

Towards LLM-Empowered Fine-Grained Speech Descriptors for Explainable Emotion Recognition

May 29, 2025
Via arXiv

Effective and Efficient One-pass Compression of Speech Foundation Models Using Sparsity-aware Self-pinching Gates

May 28, 2025
Via arXiv

Towards One-bit ASR: Extremely Low-bit Conformer Quantization Using Co-training and Stochastic Precision

May 27, 2025
Via arXiv

Unfolding A Few Structures for The Many: Memory-Efficient Compression of Conformer and Speech Foundation Models

May 27, 2025
Via arXiv