Wentao Zhang

Small-Large Collaboration: Training-efficient Concept Personalization for Large VLM using a Meta Personalized Small VLM

Aug 10, 2025
Via arXiv

M2IO-R1: An Efficient RL-Enhanced Reasoning Framework for Multimodal Retrieval Augmented Multimodal Generation

Aug 08, 2025
Via arXiv

Decoupling Continual Semantic Segmentation

Aug 07, 2025
Via arXiv

PilotRL: Training Language Model Agents via Global Planning-Guided Progressive Reinforcement Learning

Aug 01, 2025
Via arXiv

CausalStep: A Benchmark for Explicit Stepwise Causal Reasoning in Videos

Jul 22, 2025
Via arXiv

Sparse Causal Discovery with Generative Intervention for Unsupervised Graph Domain Adaptation

Jul 10, 2025
Via arXiv

Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models

Jun 15, 2025
Via arXiv

AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving

Jun 14, 2025
Via arXiv

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

Jun 09, 2025
Via arXiv

Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification

Jun 08, 2025
Via arXiv