Xiaofei Wang

Audio-Aware Large Language Models as Judges for Speaking Styles

Jun 06, 2025
Via arXiv

Towards Autonomous In-situ Soil Sampling and Mapping in Large-Scale Agricultural Environments

Jun 06, 2025
Via arXiv

Phi-Omni-ST: A multimodal language model for direct speech-to-speech translation

Jun 04, 2025
Via arXiv

Towards Efficient Speech-Text Jointly Decoding within One Speech Language Model

Jun 04, 2025
Via arXiv

Sentinel: Scheduling Live Streams with Proactive Anomaly Detection in Crowdsourced Cloud-Edge Platforms

May 29, 2025
Via arXiv

Accelerating Flow-Matching-Based Text-to-Speech via Empirically Pruned Step Sampling

May 26, 2025
Via arXiv

Adaptive Spatial Transcriptomics Interpolation via Cross-modal Cross-slice Modeling

May 15, 2025
Via arXiv

GLRD: Global-Local Collaborative Reason and Debate with PSL for 3D Open-Vocabulary Detection

Mar 26, 2025
Via arXiv

Zero-Shot Audio-Visual Editing via Cross-Modal Delta Denoising

Mar 26, 2025
Via arXiv

Joint Modelling Histology and Molecular Markers for Cancer Classification

Feb 11, 2025
Via arXiv