Xiaofei Wang

ProImage-Bench: Rubric-Based Evaluation for Professional Image Generation

Dec 13, 2025
Via arXiv

ScaleDL: Towards Scalable and Efficient Runtime Prediction for Distributed Deep Learning Workloads

Nov 13, 2025
Via arXiv

SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models

Oct 08, 2025
Via arXiv

Improving Practical Aspects of End-to-End Multi-Talker Speech Recognition for Online and Offline Scenarios

Jun 17, 2025
Via arXiv

MetaEformer: Unveiling and Leveraging Meta-patterns for Complex and Dynamic Systems Load Forecasting

Jun 15, 2025
Via arXiv

Audio-Aware Large Language Models as Judges for Speaking Styles

Jun 06, 2025
Via arXiv

Towards Autonomous In-situ Soil Sampling and Mapping in Large-Scale Agricultural Environments

Jun 06, 2025
Via arXiv

Towards Efficient Speech-Text Jointly Decoding within One Speech Language Model

Jun 04, 2025
Via arXiv

Phi-Omni-ST: A multimodal language model for direct speech-to-speech translation

Jun 04, 2025
Via arXiv

Sentinel: Scheduling Live Streams with Proactive Anomaly Detection in Crowdsourced Cloud-Edge Platforms

May 29, 2025
Via arXiv