Xiaoyi Dong

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning

Jun 08, 2026
Via arXiv

Mean Flow Policy Optimization

Apr 16, 2026
Via arXiv

Visual Self-Refine: A Pixel-Guided Paradigm for Accurate Chart Parsing

Feb 18, 2026
Via arXiv

Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition

Feb 09, 2026
Via arXiv

VISTA: Enhancing Visual Conditioning via Track-Following Preference Optimization in Vision-Language-Action Models

Feb 04, 2026
Via arXiv

Think Visually, Reason Textually: Vision-Language Synergy in ARC

Nov 19, 2025
Via arXiv

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

Oct 31, 2025
Via arXiv

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Sep 26, 2025
Via arXiv

SPARK: Synergistic Policy And Reward Co-Evolving Framework

Sep 26, 2025
Via arXiv

CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning

Sep 26, 2025
Via arXiv