Xiao Yang

Toward Polymorphic Backdoor against Semantic Communication via Intensity-Based Poisoning

Apr 25, 2026
Via arXiv

CDPR: Cross-modal Diffusion with Polarization for Reliable Monocular Depth Estimation

Apr 13, 2026
Via arXiv

Agent^2 RL-Bench: Can LLM Agents Engineer Agentic RL Post-Training?

Apr 12, 2026
Via arXiv

LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model

Apr 02, 2026
Via arXiv

Mind over Space: Can Multimodal Large Language Models Mentally Navigate?

Mar 23, 2026
Via arXiv

OmniEarth: A Benchmark for Evaluating Vision-Language Models in Geospatial Tasks

Mar 10, 2026
Via arXiv

GeoAlignCLIP: Enhancing Fine-Grained Vision-Language Alignment in Remote Sensing via Multi-Granular Consistency Learning

Mar 10, 2026
Via arXiv

Helios: Real Real-Time Long Video Generation Model

Mar 04, 2026
Via arXiv

VidDoS: Universal Denial-of-Service Attack on Video-based Large Language Models

Mar 02, 2026
Via arXiv

Reasoning as Gradient: Scaling MLE Agents Beyond Tree Search

Mar 02, 2026
Via arXiv