Xubo Liu

PolyBench: A Benchmark for Compositional Reasoning in Polyphonic Audio

Mar 05, 2026
Via arXiv

From "What" to "How": Constrained Reasoning for Autoregressive Image Generation

Mar 03, 2026
Via arXiv

Scaling Speech Tokenizers with Diffusion Autoencoders

Feb 06, 2026
Via arXiv

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026
Via arXiv

Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models

Nov 10, 2025
Via arXiv

Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation

Aug 10, 2025
Via arXiv

AudioTurbo: Fast Text-to-Audio Generation with Rectified Diffusion

May 28, 2025
Via arXiv

ProDS: Preference-oriented Data Selection for Instruction Tuning

May 19, 2025
Via arXiv

Audio-Visual Class-Incremental Learning for Fish Feeding intensity Assessment in Aquaculture

Apr 21, 2025
Via arXiv

NCL-CIR: Noise-aware Contrastive Learning for Composed Image Retrieval

Apr 06, 2025
Via arXiv