Xinyu Cai

IterCAD: An Iterative Multimodal Agent for Visually-Grounded CAD Generation and Editing

Jun 11, 2026

Training-Free Acceleration for Document Parsing Vision-Language Model with Hierarchical Speculative Decoding

Feb 13, 2026

FineFT: Efficient and Risk-Aware Ensemble Reinforcement Learning for Futures Trading

Dec 29, 2025

Learning Only with Images: Visual Reinforcement Learning with Reasoning, Rendering, and Visual Feedback

Jul 28, 2025

Fast-DataShapley: Neural Modeling for Training Data Valuation

Jun 05, 2025

GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling

Apr 30, 2025

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

Dec 16, 2024

Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving

May 24, 2024

A Multimodal Foundation Agent for Financial Trading: Tool-Augmented, Diversified, and Generalist

Feb 29, 2024

OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving

Feb 06, 2024