Bo Zhang

DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving

May 25, 2025
Via arXiv

Doc-CoB: Enhancing Multi-Modal Document Understanding with Visual Chain-of-Boxes Reasoning

May 24, 2025
Via arXiv

QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization

May 23, 2025
Via arXiv

NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification

May 22, 2025
Via arXiv

LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models

May 21, 2025
Via arXiv

Flash-VL 2B: Optimizing Vision-Language Model Performance for Ultra-Low Latency and High Throughput

May 14, 2025
Via arXiv

GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling

Apr 30, 2025
Via arXiv

TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving

Apr 22, 2025
Via arXiv

Stochastic Gradient Descent in Non-Convex Problems: Asymptotic Convergence with Relaxed Step-Size via Stopping Time Methods

Apr 17, 2025
Via arXiv

Object Placement for Anything

Apr 16, 2025
Via arXiv