Yuchen Zhou

Neo

Walk the Talk: Bridging the Reasoning-Action Gap for Thinking with Images via Multimodal Agentic Policy Optimization

Apr 08, 2026

A Synthetic Eye Movement Dataset for Script Reading Detection: Real Trajectory Replay on a 3D Simulator

Apr 07, 2026

AIForge-Doc: A Benchmark for Detecting AI-Forged Tampering in Financial and Form Documents

Feb 24, 2026

How well are open sourced AI-generated image detection models out-of-the-box: A comprehensive benchmark study

Feb 08, 2026

Optimal Convergence Analysis of DDPM for General Distributions

Oct 31, 2025

RePainter: Empowering E-commerce Object Removal via Spatial-matting Reinforcement Learning

Oct 09, 2025

ORIC: Benchmarking Object Recognition in Incongruous Context for Large Vision-Language Models

Sep 19, 2025

Careful Queries, Credible Results: Teaching RAG Models Advanced Web Search Tools with Reinforcement Learning

Aug 11, 2025

Oedipus and the Sphinx: Benchmarking and Improving Visual Language Models for Complex Graphic Reasoning

Aug 01, 2025

DA-Occ: Efficient 3D Voxel Occupancy Prediction via Directional 2D for Geometric Structure Preservation

Jul 31, 2025