Picture for Liang Wang

Liang Wang

Institute of Automation, CAS

Chatting with Images for Introspective Visual Thinking

Add code
Feb 12, 2026
Viaarxiv icon

Beyond Closed-Pool Video Retrieval: A Benchmark and Agent Framework for Real-World Video Search and Moment Localization

Add code
Feb 10, 2026
Viaarxiv icon

Human Identification at a Distance: Challenges, Methods and Results on the Competition HID 2025

Add code
Feb 07, 2026
Viaarxiv icon

PaperX: A Unified Framework for Multimodal Academic Presentation Generation with Scholar DAG

Add code
Feb 05, 2026
Viaarxiv icon

Understanding LLM Evaluator Behavior: A Structured Multi-Evaluator Framework for Merchant Risk Assessment

Add code
Feb 04, 2026
Viaarxiv icon

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Add code
Feb 04, 2026
Viaarxiv icon

BridgeV2W: Bridging Video Generation Models to Embodied World Models via Embodiment Masks

Add code
Feb 03, 2026
Viaarxiv icon

How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing

Add code
Feb 02, 2026
Viaarxiv icon

CURP: Codebook-based Continuous User Representation for Personalized Generation with LLMs

Add code
Jan 31, 2026
Viaarxiv icon

ShotFinder: Imagination-Driven Open-Domain Video Shot Retrieval via Web Search

Add code
Jan 30, 2026
Viaarxiv icon