Xixin Wu

TreePS-RAG: Tree-based Process Supervision for Reinforcement Learning in Agentic RAG

Jan 11, 2026

ELEGANCE: Efficient LLM Guidance for Audio-Visual Target Speech Extraction

Nov 09, 2025

DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models

Aug 12, 2025

Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction

Jun 11, 2025

Naturalistic Language-related Movie-Watching fMRI Task for Detecting Neurocognitive Decline and Disorder

Jun 10, 2025

WAKE: Watermarking Audio with Key Enrichment

Jun 06, 2025

RAG-Zeval: Towards Robust and Interpretable Evaluation on RAG Responses through End-to-End Rule-Guided Reasoning

May 28, 2025

Enhancing Generalization of Speech Large Language Models with Multi-Task Behavior Imitation and Speech-Text Interleaving

May 24, 2025

$C^2$AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction

Apr 01, 2025

UniSep: Universal Target Audio Separation with Language Models at Scale

Mar 31, 2025