Xiaolin Hu

Department of Computer Science and Technology, Tsinghua University, Beijing, China

MiVE: Multiscale Vision-language features for reference-guided video Editing

May 14, 2026
Via arXiv

How Mobile World Model Guides GUI Agents?

May 11, 2026
Via arXiv

Physical Adversarial Clothing Evades Visible-Thermal Detectors via Non-Overlapping RGB-T Pattern

May 06, 2026
Via arXiv

A Brain-Inspired Deep Separation Network for Single Channel Raman Spectra Unmixing

Apr 24, 2026
Via arXiv

DanceCrafter: Fine-Grained Text-Driven Controllable Dance Generation via Choreographic Syntax

Apr 20, 2026
Via arXiv

Defending against Patch-Based and Texture-Based Adversarial Attacks with Spectral Decomposition

Apr 12, 2026
Via arXiv

RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation

Feb 10, 2026
Via arXiv

A$^2$-LLM: An End-to-end Conversational Audio Avatar Large Language Model

Feb 04, 2026
Via arXiv

Beyond the Black Box: Theory and Mechanism of Large Language Models

Jan 06, 2026
Via arXiv

PartImageNet++ Dataset: Enhancing Visual Models with High-Quality Part Annotations

Jan 04, 2026
Via arXiv