Yang Xiang

Evaluating and Steering Modality Preferences in Multimodal Large Language Model

May 27, 2025

XBOUND: Exploring the Capability Boundaries of Device-Control Agents through Trajectory Tree Exploration

May 27, 2025

Enhancing Generalization of Speech Large Language Models with Multi-Task Behavior Imitation and Speech-Text Interleaving

May 24, 2025

A Semantic Information-based Hierarchical Speech Enhancement Method Using Factorized Codec and Diffusion Model

May 20, 2025

ProjectEval: A Benchmark for Programming Agents Automated Evaluation on Project-Level Code Generation

Mar 10, 2025

A Survey: Spatiotemporal Consistency in Video Generation

Feb 25, 2025

Exploiting Epistemic Uncertainty in Cold-Start Recommendation Systems

Feb 22, 2025

Learning-based A Posteriori Speech Presence Probability Estimation and Applications

Jan 23, 2025

Rethinking Membership Inference Attacks Against Transfer Learning

Jan 20, 2025

Leveraging Chain of Thought towards Empathetic Spoken Dialogue without Corresponding Question-Answering Data

Jan 19, 2025