Xiangtai Li

CyberV: Cybernetics for Test-time Scaling in Video Understanding

Jun 09, 2025

DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers

May 30, 2025

Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models

May 30, 2025

PixelThink: Towards Efficient Chain-of-Pixel Reasoning

May 29, 2025

Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

May 29, 2025

So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection

May 24, 2025

Conditional Panoramic Image Generation via Masked Autoregressive Modeling

May 22, 2025

BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation

May 19, 2025

On Path to Multimodal Generalist: General-Level and General-Bench

May 07, 2025

Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook

May 01, 2025